Abstract

We establish a large deviation principle for the empirical spectral measure of a sample covariance matrix with sub-Gaussian entries, which extends Bordenave and Caputo's result for Wigner matrices having the same type of entries [7]. To this aim, we need to establish an asymptotic freeness result for rectangular free convolution; more precisely, we give a bound in the subordination formula for information-plus-noise matrices.

1 Introduction


Throughout this paper, $\mathcal{P}(E)$ will denote the set of probability measures on a space $E$, $\mathcal{M}_{n,p}(\mathbb{R})$ (resp. $\mathcal{M}_{n,p}(\mathbb{C})$) the set of $n \times p$ real (resp. complex) matrices, $\mathcal{H}_n(\mathbb{C})$ the set of $n \times n$ Hermitian matrices, $A^t$ (resp. $A^*$) the transpose (resp. transconjugate) of a matrix $A$, and $\operatorname{Tr}(A)$ its trace. Besides, for a random variable $X$, $\overline{X}$ denotes the centred variable $X - \mathbb{E}(X)$. Finally, for two real numbers $x, y$, we denote by $x \wedge y$ the minimum of $x$ and $y$.


1.1 Large deviation results in random matrix theory 

Let us first recall some basic facts in random matrix theory (RMT). A key object in RMT is the empirical spectral measure of a matrix $A \in \mathcal{H}_n(\mathbb{C})$, namely the probability measure on $\mathbb{R}$ defined by

$$\mu_A = \frac{1}{n} \sum_{k=1}^{n} \delta_{\lambda_k(A)},$$

where $\lambda_1(A), \dots, \lambda_n(A)$ denote the eigenvalues of $A$.

It is well known (cf. [19]) that if $X$ is a Wigner matrix, i.e. $X \in \mathcal{H}_n(\mathbb{C})$ and the families of centred independent and identically distributed (i.i.d.) random variables $(X_{j,j})_{1 \le j \le n}$, $(X_{j,k})_{1 \le j < k \le n}$ are independent, and if the variance $\operatorname{Var}(X_{1,2}) = \mathbb{E}|X_{1,2} - \mathbb{E}(X_{1,2})|^2$ equals 1, then almost surely, the spectral measure $\mu_{X/\sqrt{n}}$ converges weakly towards the semicircular distribution $\mu_{sc}$, i.e. for any bounded continuous $f : \mathbb{R} \to \mathbb{R}$,

$$\lim_{n \to +\infty} \int_{\mathbb{R}} f \, d\mu_{X/\sqrt{n}} = \int_{\mathbb{R}} f \, d\mu_{sc}.$$

The semicircular distribution $\mu_{sc}$ is the probability measure on $\mathbb{R}$ defined by

$$d\mu_{sc}(x) = \frac{1}{2\pi} \sqrt{4 - x^2} \, \mathbf{1}_{[-2,2]}(x) \, dx.$$

In the case of a sample covariance matrix, i.e. a matrix $XX^*$ with $X \in \mathcal{M}_{n,p}(\mathbb{C})$ having centred i.i.d. entries, if $\operatorname{Var}(X_{1,1}) = 1$, then almost surely, the spectral measure $\mu_{XX^*/p}$ converges weakly towards the Marčenko-Pastur distribution $\mu_{MP,c}$ with ratio $c$ as $n, p \to +\infty$ with $\frac{n}{p} \to c \in (0, +\infty)$ (cf. [15]). This probability measure on $\mathbb{R}$ is defined by

$$d\mu_{MP,c}(x) = \max\left(1 - \frac{1}{c}, 0\right) \delta_0 + \frac{\sqrt{(b_c - x)(x - a_c)}}{2\pi x c} \, \mathbf{1}_{[a_c, b_c]}(x) \, dx$$

with $a_c = (1 - \sqrt{c})^2$ and $b_c = (1 + \sqrt{c})^2$.
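As a quick sanity check of this definition, the following sketch (plain Python with a midpoint quadrature; the helper names `mp_density`, `mp_atom`, `integrate`, `total_mass` and `first_moment` are ours and do not come from the paper) verifies numerically that the atom at 0 and the absolutely continuous part add up to total mass 1, and that the first moment equals 1, both for $c < 1$ and for $c > 1$.

```python
import math

def mp_edges(c):
    """Edges a_c and b_c of the bulk of the Marchenko-Pastur law."""
    return (1 - math.sqrt(c)) ** 2, (1 + math.sqrt(c)) ** 2

def mp_density(x, c):
    """Absolutely continuous part of the Marchenko-Pastur law with ratio c."""
    a, b = mp_edges(c)
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((b - x) * (x - a)) / (2 * math.pi * x * c)

def mp_atom(c):
    """Mass of the atom at 0; it is positive only when c > 1."""
    return max(1 - 1 / c, 0.0)

def integrate(f, lo, hi, steps=200_000):
    """Plain midpoint rule, accurate enough for a sanity check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

def total_mass(c):
    """Atom at 0 plus the integral of the density over [a_c, b_c]."""
    a, b = mp_edges(c)
    return mp_atom(c) + integrate(lambda x: mp_density(x, c), a, b)

def first_moment(c):
    """Mean of the law (the atom at 0 does not contribute)."""
    a, b = mp_edges(c)
    return integrate(lambda x: x * mp_density(x, c), a, b)

if __name__ == "__main__":
    for c in (0.5, 2.0):
        print(c, total_mass(c), first_moment(c))
```

The case $c = 1$ is deliberately left out of the check, since the density then has an integrable singularity at 0 that a naive midpoint rule resolves only slowly.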

For these two models in which the empirical spectral measure converges, we can investigate the speed of convergence and more particularly large deviation principles.

We recall from [9] that a sequence of random variables $(Z_n)_{n \ge 1}$ with values in a topological space $(E, \mathcal{O})$ with Borel $\sigma$-field $\mathcal{B}$ satisfies the large deviation principle (LDP) with speed $v$ and rate function $I$ in the topology $\mathcal{O}$ if

• $I : E \to [0, +\infty]$ is a lower semi-continuous function, i.e. the level set $\{x \in E \mid I(x) \le t\}$ is closed for every $t \ge 0$,

• $v : \mathbb{N} \to (0, +\infty)$ admits a limit equal to $+\infty$,

• for all $B \in \mathcal{B}$,

$$-\inf_{x \in \operatorname{Int}(B)} I(x) \le \liminf_{n \to +\infty} \frac{1}{v(n)} \log \mathbb{P}(Z_n \in B) \le \limsup_{n \to +\infty} \frac{1}{v(n)} \log \mathbb{P}(Z_n \in B) \le -\inf_{x \in \operatorname{Clo}(B)} I(x)$$

where $\operatorname{Int}(B)$ and $\operatorname{Clo}(B)$ denote resp. the interior and the closure of $B$.

We also recall that the rate function $I$ is said to be good if the level set $\{x \in E \mid I(x) \le t\}$ is compact for every $t \ge 0$.

In 0, Ben Arous and Guionnet proved that if X is in the GUE, i.e. X is 
a Wigner matrix and $X_{1,1}$ (resp. $X_{1,2}$) has law $\mathcal{N}(0,1)$ (resp. $\mathcal{N}_2(0, \frac{1}{2} I_2)$), then $\mu_{X/\sqrt{n}}$ satisfies a LDP in $\mathcal{P}(\mathbb{R})$ at speed $n^2$ with the rate function

$$I(\mu) = \frac{1}{2} \int x^2 \, d\mu(x) - \iint \log|x - y| \, d\mu(x) \, d\mu(y) - \frac{3}{4}.$$

This result was extended to LUE matrices, i.e. sample covariance matrices $XX^*$ where $X$ has standard Gaussian entries, by Hiai and Petz (see [14]). Note that in fact, these two LDPs do not concern only Gaussian matrices but also more general unitarily invariant models. They strongly rely on the fact that for the considered models, the joint distribution of the eigenvalues has an explicit form, which is also the case in [12].

In [7], Bordenave and Caputo managed to obtain a LDP for Wigner matrices in another case, where the distribution of the $X_{j,k}$'s has sub-Gaussian tails. This is remarkable because here the joint distribution of the eigenvalues is unknown. Let us recall their result.

Definition 1.1. For $\alpha > 0$ and $a \in (0, +\infty]$, we denote by $\mathcal{S}_\alpha(a)$ the class of complex random variables $Z$ such that

$$\lim_{t \to +\infty} -t^{-\alpha} \log \mathbb{P}(|Z| \ge t) = a \qquad (1)$$

and such that $|Z|$ and $Z/|Z|$ are independent for large values of $|Z|$, i.e. there exist $t_0 > 0$ and a probability measure $\vartheta_a$ on the unit circle $\mathbb{S}^1$ such that for all $t \ge t_0$ and all measurable sets $U \subset \mathbb{S}^1$, we have

$$\mathbb{P}(Z/|Z| \in U \cap |Z| \ge t) = \vartheta_a(U) \, \mathbb{P}(|Z| \ge t).$$

In particular, a real random variable $Z$ belongs to $\mathcal{S}_\alpha(a)$ if it satisfies (1) and there exist $t_0 > 0$ and a probability measure $\vartheta_a$ on $\{-1, 1\}$ such that for all $t \ge t_0$ and all $U \subset \{-1, 1\}$, we have

$$\mathbb{P}(|Z| \ge t \cap \operatorname{sign}(Z) \in U) = \vartheta_a(U) \, \mathbb{P}(|Z| \ge t). \qquad (2)$$

Note that the first hypothesis implies that a random variable in $\mathcal{S}_\alpha(a)$ has finite moments of all orders.
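To make condition (1) concrete, here is a small illustrative sketch. The tail $\mathbb{P}(|Z| \ge t) = e^{-a t^\alpha}/(1+t)$ is a hypothetical example chosen by us (it is not taken from the paper); the code works with the logarithm of the tail to avoid underflow and shows that $-t^{-\alpha} \log \mathbb{P}(|Z| \ge t)$ approaches $a$ as $t$ grows, the polynomial factor being negligible at this scale.

```python
import math

def log_tail(t, a, alpha):
    """log P(|Z| >= t) for the hypothetical tail exp(-a * t**alpha) / (1 + t)."""
    return -a * t ** alpha - math.log1p(t)

def tail_exponent(t, a, alpha):
    """The quantity -t**(-alpha) * log P(|Z| >= t) appearing in (1)."""
    return -log_tail(t, a, alpha) / t ** alpha

if __name__ == "__main__":
    # The values decrease towards a = 2 as t increases.
    for t in (10.0, 1e3, 1e6):
        print(t, tail_exponent(t, a=2.0, alpha=1.5))
```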

Theorem 1.2 (see [7, Theorem 1.1]). Let $X$ be a Wigner matrix with $X_{1,2} \in \mathcal{S}_\alpha(a)$ and $X_{1,1} \in \mathcal{S}_\alpha(b)$ for some $\alpha \in (0,2)$ and $a, b \in (0, +\infty]$. Then the spectral measure $\mu_{X/\sqrt{n}}$ satisfies the LDP with speed $n^{1+\alpha/2}$ and good rate function

$$I(\mu) = \begin{cases} \Phi(\nu) & \text{if there exists } \nu \in \mathcal{P}(\mathbb{R}) \text{ such that } \mu = \mu_{sc} \boxplus \nu \\ +\infty & \text{otherwise} \end{cases}$$

where $\Phi : \mathcal{P}(\mathbb{R}) \to [0, +\infty]$ is a good rate function (see [7] for further details) and $\boxplus$ denotes the free convolution (see Section 1.2).

Let us make a few remarks about this result. Roughly speaking, after random matrix considerations, the proof of Theorem 1.2 consists in proving a LDP for some random graphs associated to the Wigner matrix $X$. Therefore, the rate function $\Phi$ is expressed as the supremum of functions of probability measures on graphs and it cannot be computed in general. However, in some particular cases, it is possible to compute $\Phi(\nu)$. For example, if $\nu$ is a
symmetric distribution on R, 5 < 00 and the support of i9b is { — 1,1}, then 
we have 

where ma{v) denotes the a-th moment of v. 

Theorem 1.7 below will extend Theorem 1.2 to sample covariance matrices $XX^*$ with $X_{1,1} \in \mathcal{S}_\alpha(a)$ for some $\alpha \in (0,2)$, $a \in (0, +\infty]$. Note that to simplify, we will assume that $X$ is a real random matrix.

Let us mention here that LDPs for the top eigenvalue of Wigner matrices 
have also been obtained in Ben Arous and Guionnet’s setting, see m p- 81], 
and for the model introduced by Bordenave and Caputo in [2].

1.2 Deformed matrix models

After understanding the behaviour of the spectral measure of Wigner matrices or sample covariance matrices, the question of deformations of these models has been investigated. Several types of deformations have been studied, the main ones being matrices of the type $X + A$ with $A \in \mathcal{H}_n(\mathbb{C})$ (additive deformation), $\Sigma^{1/2} X X^* \Sigma^{1/2}$ with $\Sigma \in \mathcal{H}_n(\mathbb{C})$ positive definite (multiplicative deformation) or $(X + A)(X + A)^*$ with $A \in \mathcal{M}_{n,p}(\mathbb{C})$ (information-plus-noise model).

A tool to study the spectral measure of a deformation is free probability, and more particularly free convolutions. Let us recall their definitions.

Theorem 1.3 (see [18]). Let $A, B$ be two independent $n \times n$ Hermitian random matrices such that

• either $A$ or $B$ is unitarily invariant, i.e. for $M = A$ or $B$, for any unitary $U \in \mathcal{M}_n(\mathbb{C})$, $UMU^*$ has the same law as $M$,

• $\mu_A$ and $\mu_B$ converge weakly in probability to some distributions $\mu_1$ and $\mu_2$ on $\mathbb{R}$, as $n \to +\infty$.

Then, as $n \to +\infty$, the spectral measure $\mu_{A+B}$ converges weakly in probability to a deterministic distribution depending only on $\mu_1$ and $\mu_2$. This distribution is called the free (additive) convolution of $\mu_1$ and $\mu_2$, and is denoted by $\mu_1 \boxplus \mu_2$.

A similar result also exists for the singular values of the sum of two rectangular matrices and it is due to Benaych-Georges. The empirical singular value distribution of a matrix $A \in \mathcal{M}_{n,p}(\mathbb{C})$ is the probability measure on $\mathbb{R}_+$ defined by

$$\nu_A = \frac{1}{n \wedge p} \sum_{k=1}^{n \wedge p} \delta_{\sigma_k(A)},$$

where $\sigma_1(A), \dots, \sigma_{n \wedge p}(A)$ denote the singular values of $A$, i.e. the square roots of the eigenvalues of the positive matrix $AA^*$ (resp. $A^*A$) if $n \le p$ (resp. $n > p$).

Theorem 1.4 (see [5, Theorem 3.13]). Let $A, B$ be two independent $n \times p$ random matrices such that

• either $A$ or $B$ is bi-unitarily invariant, i.e. for $M = A$ or $B$, for any unitary matrices $U \in \mathcal{M}_n(\mathbb{C})$ and $V \in \mathcal{M}_p(\mathbb{C})$, $UMV$ has the same law as $M$,

• $\nu_A$ and $\nu_B$ converge weakly in probability to some distributions $\mu_1$ and $\mu_2$ on $\mathbb{R}_+$ as $n, p \to +\infty$ with $\frac{n}{p} \to c \in (0, +\infty)$.

Then, as $n \to +\infty$, the singular value distribution $\nu_{A+B}$ converges weakly in probability to a deterministic distribution depending only on $\mu_1$, $\mu_2$ and $c$. This distribution is called the rectangular free convolution with ratio $c$ of $\mu_1$ and $\mu_2$, and is denoted by $\mu_1 \boxplus_c \mu_2$.


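Free convolutions can be characterized in terms of another key object in RMT, the Stieltjes transform. For a probability measure $\mu$ on $\mathbb{R}$, we call the Stieltjes transform of $\mu$ the function $G_\mu : \mathbb{C} \setminus \mathbb{R} \to \mathbb{C}$ defined by

$$G_\mu(z) = \int_{\mathbb{R}} \frac{d\mu(x)}{z - x}$$

for all $z \in \mathbb{C} \setminus \mathbb{R}$. The following properties are obvious:

$$|G_\mu(z)| \le \frac{1}{|\operatorname{Im} z|}$$

and

$$|G_\mu(z) - G_\mu(z')| \le \frac{|z - z'|}{|\operatorname{Im} z| \, |\operatorname{Im} z'|}.$$

We will use them implicitly in this paper.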

Note that the notion of Stieltjes transform is related to that of the resolvent, since for a matrix $A \in \mathcal{H}_n(\mathbb{C})$, we have $G_{\mu_A}(z) = \frac{1}{n} \operatorname{Tr}((zI_n - A)^{-1})$. Useful properties of resolvents we will use in this paper are gathered in Appendix B.
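The link between the Stieltjes transform and the resolvent, as well as the two elementary bounds above, can be checked by hand on a $2 \times 2$ real symmetric matrix. The sketch below is only an illustration under that simplifying assumption; the function names `stieltjes`, `eig_sym2` and `resolvent_trace2` are ours.

```python
import math

def stieltjes(atoms, z):
    """G_mu(z) for the uniform probability measure on `atoms`."""
    return sum(1 / (z - x) for x in atoms) / len(atoms)

def eig_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mid = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    return [mid - rad, mid + rad]

def resolvent_trace2(a, b, d, z):
    """(1/2) Tr((z I - A)^{-1}) for A = [[a, b], [b, d]].

    The inverse of [[z - a, -b], [-b, z - d]] has diagonal entries
    (z - d) / det and (z - a) / det.
    """
    det = (z - a) * (z - d) - b * b
    return ((z - d) + (z - a)) / (2 * det)

if __name__ == "__main__":
    a, b, d = 1.0, 2.0, -1.0
    z, w = 0.3 + 1.5j, -1.0 + 0.7j
    eigenvalues = eig_sym2(a, b, d)
    # Both expressions of the Stieltjes transform of the spectral measure agree.
    print(stieltjes(eigenvalues, z), resolvent_trace2(a, b, d, z))
    # First bound: |G(z)| <= 1 / |Im z|.
    print(abs(stieltjes(eigenvalues, z)), 1 / abs(z.imag))
    # Second bound: |G(z) - G(w)| <= |z - w| / (|Im z| |Im w|).
    print(abs(stieltjes(eigenvalues, z) - stieltjes(eigenvalues, w)),
          abs(z - w) / (abs(z.imag) * abs(w.imag)))
```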


The Stieltjes transform allows us to express subordination relations for free convolutions. To state these relations, we need some additional notation. For $\mu \in \mathcal{P}(\mathbb{R})$, we denote by $\mu^2$ the distribution of $X^2$ when $X$ has law $\mu$. Similarly, for $\mu \in \mathcal{P}(\mathbb{R}_+)$, we denote by $\sqrt{\mu}$ the symmetrization of the distribution $\nu$ of $\sqrt{X}$ when $X$ has law $\mu$, i.e. the symmetric distribution on $\mathbb{R}$ defined by $\sqrt{\mu}(B) = \frac{\nu(B) + \nu(-B)}{2}$ for all Borel sets $B$. We have the following subordination formulas, the first is due to Biane (cf. [6]) and the second is obtained from Dozier and Silverstein's work [10] and a paper by Benaych-Georges (cf. [5]).

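Proposition 1.5.

• Let $\mu \in \mathcal{P}(\mathbb{R})$ and $\nu = \mu \boxplus \mu_{sc}$. We have

$$G_\nu(z) = G_\mu(z - G_\nu(z)). \qquad (3)$$

• Let $\mu \in \mathcal{P}(\mathbb{R}_+)$, $c > 0$ and $\nu = (\sqrt{\mu} \boxplus_c \sqrt{\mu_{MP,c}})^2$. We have

$$G_\nu(z) = (1 - cG_\nu(z)) \, G_\mu\left(z(1 - cG_\nu(z))^2 - (1 - c)(1 - cG_\nu(z))\right). \qquad (4)$$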
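Both subordination formulas can be tested in the simplest case $\mu = \delta_0$, for which $G_\mu(w) = 1/w$ and $\nu$ reduces to $\mu_{sc}$ in (3) and to $\mu_{MP,c}$ in (4). The sketch below is only a numerical illustration under this assumption: it solves each relation by naive fixed-point iteration, which we only expect to converge for $\operatorname{Im} z$ large enough, and compares the solution of (4) with a direct quadrature of the Marčenko-Pastur density for $c < 1$. All function names are ours.

```python
import math

def solve_fixed_point(phi, z, iterations=200):
    """Iterate g -> phi(g), starting from the large-|z| approximation 1/z."""
    g = 1 / z
    for _ in range(iterations):
        g = phi(g)
    return g

def semicircle_transform(z):
    """Solve (3) for mu = delta_0, i.e. g = 1 / (z - g)."""
    return solve_fixed_point(lambda g: 1 / (z - g), z)

def mp_transform(z, c):
    """Solve (4) for mu = delta_0, where G_mu(w) = 1 / w."""
    def phi(g):
        w = z * (1 - c * g) ** 2 - (1 - c) * (1 - c * g)
        return (1 - c * g) / w
    return solve_fixed_point(phi, z)

def mp_transform_by_quadrature(z, c, steps=100_000):
    """Stieltjes transform of the Marchenko-Pastur law for c < 1 (no atom),
    computed by a midpoint rule on its density."""
    a, b = (1 - math.sqrt(c)) ** 2, (1 + math.sqrt(c)) ** 2
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        x = a + (k + 0.5) * h
        density = math.sqrt((b - x) * (x - a)) / (2 * math.pi * x * c)
        total += density / (z - x)
    return h * total

if __name__ == "__main__":
    z, c = 0.5 + 3j, 0.5
    g = semicircle_transform(z)
    print(abs(g * g - z * g + 1))  # residual of g^2 - z g + 1 = 0
    h = mp_transform(z, c)
    print(abs(c * z * h * h - (z + c - 1) * h + 1))  # residual of the quadratic
    print(abs(h - mp_transform_by_quadrature(z, c)))
```

For $\mu = \delta_0$, relation (4) reduces to the quadratic equation $c z g^2 - (z + c - 1) g + 1 = 0$, which is the classical equation satisfied by the Stieltjes transform of the Marčenko-Pastur law in the sign convention $G_\mu(z) = \int \frac{d\mu(x)}{z - x}$ used here.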

In Theorem 1.6 below, we are interested in the information-plus-noise model and we control the distance between the spectral measure and the corresponding rectangular free convolution, by bounding the difference between the two terms in (4) evaluated at the average Stieltjes transform.

1.3 Main results

Note that in the rest of the paper, we will only consider real matrices for simplicity, but our results should generalize to complex matrices by adapting the proofs. The only difficulty in the complex case is to adapt the general integration by parts formula (28), which is used several times in this paper; this would lead to heavier computations.

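Let us define, for $s, t > 0$, the distance $d_{s,t}$ on $\mathcal{P}(\mathbb{R})$ by

$$d_{s,t}(\mu, \nu) = \sup_{z \in \mathcal{D}_{s,t}} |G_\mu(z) - G_\nu(z)|, \qquad (5)$$

where

$$\mathcal{D}_{s,t} = \left\{ z \in \mathbb{C} \;\middle|\; \operatorname{Im} z \ge s, \ \left| \frac{\operatorname{Re} z}{\operatorname{Im} z} \right| \le t \right\}. \qquad (6)$$

As the distance $d$ defined in [7], $d_{s,t}$ metrizes weak convergence. Let us mention that for all $\mu, \nu \in \mathcal{P}(\mathbb{R})$, we have

$$d_{s,t}(\mu, \nu) \le \min\left( \frac{\pi}{s} \, d_{KS}(\mu, \nu), \ \frac{1}{s^2} \, W_1(\mu, \nu) \right) \qquad (7)$$

where $d_{KS}$ and $W_1$ are respectively the Kolmogorov-Smirnov and the $L^1$-Wasserstein distances on $\mathcal{P}(\mathbb{R})$. Some key inequalities for the distance between two empirical spectral measures are summarized in Appendix B.3. Our first main result is the following.

Theorem 1.6. We assume that $c_n = \frac{n}{p}$ is bounded below and above by two constants in $(0, +\infty)$. Let $c > 0$. There exist $s, t > 0$ and a constant $C_{s,t} > 0$ such that for any random matrix $Y \in \mathcal{M}_{n,p}(\mathbb{R})$ with i.i.d. entries satisfying $\operatorname{Var}(Y_{1,1}) = 1$ and $\mathbb{E}(Y_{1,1}^4) < +\infty$, for any deterministic matrix $M \in \mathcal{M}_{n,p}(\mathbb{R})$, and for all $n$ large enough, we have

$$d_{s,t}\left( \mathbb{E}\mu_{(Y/\sqrt{p} + M)(Y/\sqrt{p} + M)^*}, \ (\sqrt{\mu_{MM^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2 \right) \le C_{s,t} \left[ \left( \mathbb{E}|\overline{Y}_{1,1}|^3 + \mathbb{E}(\overline{Y}_{1,1}^4) \right) \left( \frac{1}{\sqrt{n}} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n} \right) + |c_n - c| + \frac{1}{n} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}} \right]$$

where $\overline{Y}$ is the matrix whose entries are given by $\overline{Y}_{j,k} = Y_{j,k} - \mathbb{E}(Y_{j,k})$.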


This result helps to understand the influence of the deformation in the information-plus-noise model. First, we can observe a decorrelation between the classical term $\frac{1}{\sqrt{n}}$ and the Frobenius norm of the deformation divided by a better power of $n$, namely $\frac{\operatorname{Tr}(MM^*)^{1/2}}{n}$. It is important for us to get this precise estimate since in Section 3 we apply Theorem 1.6 to a matrix $M$
whose Frobenius norm is not bounded but of order ^/nlogn. 

Besides, it is interesting to compare Theorem 1.6 to the Wigner case (cf. [7, Theorem 2.6]). Bordenave and Caputo investigated additive deformations and obtained that in this model, the distance between the spectral measure and the corresponding free additive convolution is bounded by a term of order $\frac{1}{\sqrt{n}}$. This bound is uniform in the deformation $M$ and it depends on the initial matrix through its moments only. In the case of sample covariance matrices, it would have been surprising if we had obtained a better bound. Table 1 below compares Bordenave and Caputo's results with ours in the Gaussian and the general cases. In addition to this, let us mention that in [8], the authors were interested in the case of Wigner matrices whose entries have a symmetric distribution satisfying a Poincaré inequality, which leads to better bounds than [7].



                     Gaussian                                                       Non-Gaussian

Wigner matrix        Deformed GUE matrix                                            Deformed Wigner matrix
                     $\frac{1}{n}$                                                  $\frac{1}{\sqrt{n}}$

Covariance matrix    Deformed LOE matrix                                            Info-plus-noise matrix
                     $\frac{1}{n} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}}$   $\frac{1}{\sqrt{n}} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n}$

Table 1: Bound in the subordination relation (3) or (4) for different matrix models.


Theorem 1.6 above will be used in the proof of our second main result.

Theorem 1.7. Let $X \in \mathcal{M}_{n,p}(\mathbb{R})$ be a random matrix such that $c_n = \frac{n}{p} \to c \in (0, +\infty)$. We assume that $\operatorname{Var}(X_{1,1}) = 1$ and that there exist $\alpha \in (0,2)$ and $a \in (0, +\infty]$ such that $X_{1,1} \in \mathcal{S}_\alpha(a)$.

Then, the empirical spectral measure $\mu_{XX^t/p}$ satisfies the LDP with speed $n^{1+\alpha/2}$ governed by the good rate function $J$ defined by

{ -^rnai 2 i’^) 'i'f ihere exists v G V{R+) s.t. /i = y'//MP,c)^ 

and i^({0}) > max (0,1 — 

-|-oo otherwise 

where $m_p(\mu) = \int |x|^p \, d\mu(x)$ denotes the $p$-th moment of a distribution $\mu$.

It is very similar to Bordenave and Caputo's result (see Theorem 1.2), the main difference being the explicit expression of the rate function in all cases. This is due to the fact that here, we can achieve large deviation explicitly without using a LDP on graphs.










The rest of the paper is organized as follows. In Section 2, we prove the bound for rectangular free convolution stated in Theorem 1.6. In Section 3, we prove the large deviation principle in Theorem 1.7. In Appendix A, we state and prove concentration results used in Sections 2 and 3. Finally, in Appendix B, we summarize miscellaneous inequalities and identities used throughout the paper.


2 Asymptotic freeness 


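This section is devoted to the proof of Theorem 1.6. This theorem is in fact a consequence of the following, as we will see in Section 2.1.

Theorem 2.1 (Bound in subordination formula (4)). We assume that $c_n = \frac{n}{p}$ is bounded below and above by two constants in $(0, +\infty)$. Let $c > 0$. There exist $s, t > 0$ and a function $f$, bounded on the domain $\mathcal{D}_{s,t}$ defined by (6), such that for any random matrix $Y \in \mathcal{M}_{n,p}(\mathbb{R})$ with i.i.d. entries satisfying $\operatorname{Var}(Y_{1,1}) = 1$ and $\mathbb{E}(Y_{1,1}^4) < +\infty$, for any deterministic matrix $M \in \mathcal{M}_{n,p}(\mathbb{R})$, for all $n$ large enough, and for all $z \in \mathcal{D}_{s,t}$, we have

$$\left| \underline{g}(z) - (1 - c\underline{g}(z)) \, G_{\mu_{MM^*}}\left( z(1 - c\underline{g}(z))^2 - (1 - c)(1 - c\underline{g}(z)) \right) \right| \le f(z) \left( \mathbb{E}|\overline{Y}_{1,1}|^3 + \mathbb{E}(\overline{Y}_{1,1}^4) \right) \left( \frac{1}{\sqrt{n}} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n} \right) + f(z) \left( |c_n - c| + \frac{1}{n} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}} \right)$$

where $g(z) = G_{\mu_{(Y/\sqrt{p} + M)(Y/\sqrt{p} + M)^*}}(z)$ and $\underline{g}(z) = \mathbb{E}(g(z))$.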

The proof of Theorem 2.1 follows the same lines as Bordenave and Caputo's proof of the bound in subordination formula (3) for free additive convolution (see [7, Theorem A.1]). It consists of two main steps: the Gaussian case and the general case, which we deduce from the Gaussian case by interpolation. However, in the case of sample covariance matrices, the computations are heavier and some bounds must be finer.

Let us mention that in the Gaussian case, the bound consists only of the last terms (see Proposition 2.3).

In the proof, we define

$$X = \frac{Y}{\sqrt{p}} + M$$

and we denote by

$$S = (zI_n - XX^t)^{-1}$$

the resolvent of $XX^t$. We consider $s > 2$, $t > 0$, and along the proof, $s$ can increase and $t$ can decrease. Moreover, $f$ will denote a bounded function on $\mathcal{D}_{s,t}$ which can also change from one line to another. In particular, for all $z \in \mathcal{D}_{s,t}$ and $x \le y$, since we have

$$\frac{|z|^x}{|\operatorname{Im} z|^y} \le \frac{(1 + t^2)^{x/2}}{|\operatorname{Im} z|^{y - x}} \le \frac{(1 + t^2)^{x/2}}{s^{y - x}},$$

we will write

$$\frac{|z|^x}{|\operatorname{Im} z|^y} \le f(z)$$

as soon as $x \le y$.

Before starting the proofs, let us state a lemma we will use in the different steps. $B_{\mathbb{C}}(z, \delta)$ denotes here the ball with centre $z \in \mathbb{C}$ and radius $\delta > 0$ for the usual distance in $\mathbb{C}$.


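Lemma 2.2. For $\mu \in \mathcal{P}(\mathbb{R})$ and $z \in \mathbb{C}$, we define

$$\phi_{z,\mu} : (h, \gamma) \mapsto (1 - \gamma h) \, G_\mu\left( z(1 - \gamma h)^2 - (1 - \gamma)(1 - \gamma h) \right).$$

There exist $s, t > 0$ and $l_{s,t}, l'_{s,t} \in (0, 1)$ such that

• for all $\mu \in \mathcal{P}(\mathbb{R})$, $z \in \mathcal{D}_{s,t}$, and $\gamma > 0$, $\phi_{z,\mu}(\cdot, \gamma)$ is Lipschitz on $B_{\mathbb{C}}(0, \frac{1}{s})$ with constant $l_{s,t}$,

• for all $\mu \in \mathcal{P}(\mathbb{R})$, $z \in \mathcal{D}_{s,t}$, and $h \in B_{\mathbb{C}}(0, \frac{1}{s})$, $\phi_{z,\mu}(h, \cdot)$ is Lipschitz on $(0, +\infty)$ with constant $l'_{s,t}$.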

The proof of this lemma consists in simple computations and is left to 
the reader. Let us mention however that it relies on the inequality 

(cr - 1)((T^ - 2) 2t{a + l)\ \l - j\ 


Imr/I > luiz 


cj^(cr + 1) cr^ / a 


( 8 ) 


where y = z{l — 7 / 1 )^ — (1 — 7 )(1 — 7 / 1 ) and a = ^. We will use it again 
later. 

Furthermore, note that by choosing a larger $s$ and a smaller $t$, $l_{s,t}$ and $l'_{s,t}$ can be made as close to 0 as wanted.


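2.1 Proof of Theorem 1.6

First, let us deduce Theorem 1.6 from Theorem 2.1.

Proof. We define $\nu = (\sqrt{\mu_{MM^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2$ and we consider the function $\phi_{z,\mu_{MM^*}}$ defined in Lemma 2.2. Subordination formula (4) can be rewritten $\phi_{z,\mu_{MM^*}}(G_\nu(z), c) = G_\nu(z)$ for all $z \in \mathbb{C} \setminus \mathbb{R}$. Consequently, using Lemma 2.2, there exist $s, t > 0$ and $l_{s,t} \in (0, 1)$ such that for all $z \in \mathcal{D}_{s,t}$,

$$|\underline{g}(z) - G_\nu(z)| \le |\underline{g}(z) - \phi_{z,\mu_{MM^*}}(\underline{g}(z), c)| + |\phi_{z,\mu_{MM^*}}(\underline{g}(z), c) - \phi_{z,\mu_{MM^*}}(G_\nu(z), c)| \le |\underline{g}(z) - \phi_{z,\mu_{MM^*}}(\underline{g}(z), c)| + l_{s,t} \, |\underline{g}(z) - G_\nu(z)|$$

thus

$$|\underline{g}(z) - G_\nu(z)| \le \frac{1}{1 - l_{s,t}} \, |\underline{g}(z) - \phi_{z,\mu_{MM^*}}(\underline{g}(z), c)|.$$

From Theorem 2.1, in which we bound $f$ by a constant depending on $s, t$, and from the definition (5) of $d_{s,t}$, we finally get

$$d_{s,t}\left( \mathbb{E}\mu_{XX^t}, \nu \right) \le C_{s,t} \left[ \left( \mathbb{E}|\overline{Y}_{1,1}|^3 + \mathbb{E}(\overline{Y}_{1,1}^4) \right) \left( \frac{1}{\sqrt{n}} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n} \right) + |c_n - c| + \frac{1}{n} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}} \right].$$

□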


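2.2 The Gaussian case

In this subsection, we assume that $Y_{1,1}$ is a standard Gaussian. Moreover, we will simply denote $g(z)$ and $\underline{g}(z)$ by $g$ and $\underline{g}$ respectively (see Theorem 2.1 for their definitions). We will prove the following bound.

Proposition 2.3. There exist $s, t > 0$ and a function $f$, bounded on $\mathcal{D}_{s,t}$, such that for any random matrix $Y \in \mathcal{M}_{n,p}(\mathbb{R})$ with i.i.d. standard Gaussian entries, for any deterministic matrix $M \in \mathcal{M}_{n,p}(\mathbb{R})$, for all $n$ large enough, and for all $z \in \mathcal{D}_{s,t}$, we have

$$\left| \underline{g} - (1 - c\underline{g}) \, G_{\mu_{MM^*}}\left( z(1 - c\underline{g})^2 - (1 - c)(1 - c\underline{g}) \right) \right| \le f(z) \left( |c_n - c| + \frac{1}{n} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}} \right).$$

To prove Proposition 2.3, we will follow and improve some computations by Dumont et al., see [11, Appendix II].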

Lemma 2.4 (adaptation from [17, Formula (122)]). Let $Y \in \mathcal{M}_{n,p}(\mathbb{R})$ be a random matrix with i.i.d. standard Gaussian entries, let $M \in \mathcal{M}_{n,p}(\mathbb{R})$ be a deterministic matrix, and let $z \in \mathbb{C} \setminus \mathbb{R}$. For every integer $n$, we have


g-- Tr(R) = - Tt(AR) - % Tr(A) TrCE(S)R) 
— n n 


+ - Tr(A'i?) - ^ Tt(A') Tv{E{S)R) (9) 
n 


where 



( 10 ) 
1


A = 




p{l- Cng) y 


E Tr(5AM‘)5 + 


CnZ 


1 - Cng 


E{gS) 


+ 


( 


p{l - Cng)'^ y 


E gTriSXM^) E(5), (11) 


and 


A' = -— -- E(SXM^S) + -- E(zS^ - S) 

p{l- Cng) p[l- Cng) 

^ PHI-Cns)^ E(Tr(S^XM‘)) E(S). (12) 

In this lemma, we compare $\underline{g}$ to $\frac{1}{n} \operatorname{Tr}(R)$ because, using the notations in Lemma 2.2, we have $\frac{1}{n} \operatorname{Tr}(R) = \phi_{z,\mu_{MM^*}}(\underline{g}, c_n)$, so $\frac{1}{n} \operatorname{Tr}(R)$ is close to $\phi_{z,\mu_{MM^*}}(\underline{g}, c)$ by the second part of Lemma 2.2. That is interesting if we have in mind our goal, which is Proposition 2.3.

Note that, as in [17, Formula (122)], the proof of Lemma 2.4 mainly relies on the Gaussian integration by parts formula (27), so we do not give it here.

However, we can observe an important difference between Formula (122) in [17] and Lemma 2.4, namely the terms in $\Delta'$. In fact, the background here is not exactly the same as in [17]. Indeed, Vallet et al. consider complex Gaussian entries with independent real and imaginary parts having the same distribution in the matrix $Y$, whereas we consider real Gaussian entries. Consequently, some simplifications do not occur any longer and a new term appears. Behind this phenomenon is the quantity $\zeta = K_{1,1} + 2iK_{1,2} - K_{2,2}$, where $K$ denotes the covariance matrix of the Gaussian vector $(\operatorname{Re} Y_{1,1}, \operatorname{Im} Y_{1,1})$. This quantity is equal to 0 in [17] and to 1 here, which is why we have an additional term.


In the next lemma, we bound the different terms appearing in (9). For this, we will use the concentration bounds (68) and (70) for the terms in $\Delta$ and standard inequalities on traces and resolvents (see Propositions B.1 and B.2) for the terms in $\Delta'$. Our computations will partially follow those in [17].

Lemma 2.5. There exist $s, t > 0$ and a function $f$, bounded on $\mathcal{D}_{s,t}$, such that for all $Y$, $M$, $n$, and $z$ as in Proposition 2.3, we have

$$\left| \underline{g} - \frac{1}{n} \operatorname{Tr} R \right| \le f(z) \left( \frac{1}{n} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}} \right).$$

This lemma shows that $\frac{1}{n} \operatorname{Tr}(R)$ is a deterministic equivalent to the Stieltjes transform $g(z) = \frac{1}{n} \operatorname{Tr}(S)$ as soon as $\frac{\operatorname{Tr}(MM^*)^{1/2}}{n^{5/4}}$ tends to 0 as $n \to +\infty$, i.e. when the perturbation $M$ is not too large.

We can compare this result with the bound obtained in [17, Proposition 6]. Two main differences must be highlighted. First, as we mentioned above, the model is not exactly the same. Indeed, we consider real Gaussian entries and not complex Gaussian entries with independent real and imaginary parts, which produces an additional term in $\Delta'$. However, the terms in $\Delta$ are present in both cases, so we can compare the bounds for these terms. Here is the second difference. In [17], the authors assume that $\|M\|$
is uniformly bounded in n and get the bound Here, for the terms in 

A, we will get the bound 

^ f I , Tr(MM*)V2 



Moreover, if we use the bound (f6^ instead of dZO]) in the proof, and if 
we observe that Tr(MM*)^/^ < yTillMlI, then we get the bound + 

||M||), which is the same as in [T7] when ||M|| is uniformly bounded in n. 
Consequently, our bound has two advantages: it is slightly better than the bound in [17] and it applies without any assumption on $M$.

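Proof. First of all, let us remark that

$$\left| \frac{1}{1 - c_n \underline{g}} \right| \le f(z)$$

since $|\underline{g}| \le \frac{1}{|\operatorname{Im} z|}$. Besides, we have

$$\|R\| \le f(z) \qquad (13)$$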

because on the one hand, ^ is a resolvent evaluated at rj = z{l — CnQ^ ~ 
(1 — Cn)(l — Cng) so its operator norm is less than and on the other 

hand, we have the inequalities |1 — Cng\ < 1 -|- yq^^ and dH]) (we apply the 
latter with a = ^). 

By proposition IB. II (iii. it follows that 


-Tt(IE(5)i?) 

n 


Im z 


or just 


- Tt{E{S)R) 
n 




(14) 


Note that more precise bounds can be obtained, see [17, Appendix E].



















Next, let us recall that A is defined by 


A = 


J' 

p(l- Cng) y 


E Tr(5AM‘)5 + 


CnZ 


+ 


1 - Cng 
Cn 


E{gS) 


p{l - Cng)'^ y 


E gTr{SXM^) E{S) 


and observe that Tr(S'AM*) = Tr(A*5M). 

The first term in ([9l) we bound is Tr(A) Tr(E(5)-R)|. First, using the 
concentration bounds (ESI), (ESI), and the Cauchy-Schwarz inequality, we get 


Tr 


n 


— --E Tr(SXM^)S 

^p{l - Cng) \ ^ 


-Tr 

n 


p{l - Cng) 


E [(Tr(5MA*) -ETr(S'MA*))(5-E5)] 


I - Cng 


■E 


-(Tr(A*SM) -ETr(A*5M))-(Tr(5) -ETr(5)) 
n n 


< 


< 


< 


|1 - Cng\ 


\ 1/2 


Var - Ti{X^SM) Var - Tr(5) 


n 




1 


\ 1/2 


n 




2 ^ Tr(MM*)i/2 

n^( 


(15) 


where $u(z)$ and $v(z)$ are defined in Proposition A.1.

Next, using the identity $g = \frac{1}{n} \operatorname{Tr}(S)$ and (68), we have


Tr 


n 


CnZ 


1 - Cng 


E{gS) 


CnZ 


1 - Cng 


Var( 5 ) 


^ Cn\z\ 4:CnU{z) ^ f{z) 

~ |1 — Cng\ fi^ ~ 


where, for the last inequality, we used the definition of u{z) to get 

11 Cng\ 

The same arguments also allow to show that 


(16) 


-Tr 

n 


i 


E gTT{SXM^) E{S) 


p{l - Cng)"^ y 
Combining inequalities from (I14p to dm) gives 


< 2^i^Tr(MM*)^/^ (17) 

^17/8 




.Tr(A)Tr(E(5)i2) < f{z) 


1 Tr(MM*)^/2 




n 


17/8 


(18) 
Computations are similar for the term $\frac{1}{n} \operatorname{Tr}(\Delta R)$, using the additional
inequalities (fT3|) and < Y^||i2|| Isee Proposition IB.ll livll. For 

instance, we have 


^TrE [TviSXM^jsR 
1 


n 




-Tr 


n \p{l - Cng) 


E [(Tr{SXM^) - ETr{SXM^)){S - ES)R] 


< 


< 


< 


< 


\ 1/2 


-—-—- Var ( - Tt{X*SM)] Var f - Tr(SR)) 
|l-Cn5l \n J \n J 


|i - Cng\ 


n 


n 


5/2 


\ 1/2 


Cn f 9CnV(z) ^ ^ f 4:CnU(z) „ , 

-—-—r ^ Tr(MM*) / ’ f{i 

|l-Cn5lV 7 V 


1/2 


2 ^ TV(MM*) 1/2 . 


Combining with 


Tr 


n 


c„z 


1 - Cng 


msR) 


fiz) 


< 2 


and 


- Tr [ Cn E 
n \ 


(/Tr(5iM‘) j E{S)r] < TV(MM*)1/7 




which have a similar proof, we thus have 


- Tr(AR) 
n 


,, ,( I Tr(MM*)i/2' 

< /(^) + 




n 


17/8 


(19) 


We have bounded the terms in Lemma 2.4 in which $\Delta$ appears thanks to the concentration bounds proved in Appendix A. We will now consider the terms in which $\Delta'$ appears, in other words the terms not present in [17]. To this end, we will only use inequalities on traces and resolvents (see Propositions B.1 and B.2). Let us recall that $\Delta'$ is defined by


A' = 


1 


p(l - Cng) 


E{SXM*S) + 


1 


p(l - Cng) 


+ 


E{zS^ - S) 


1 


p 2 (l - Cnff)^ 


E(Tr(52AM*))E(5) 
Using inequalities (i)-(iv) in Proposition B.1 and the resolvent identity $SXX^t = zS - I_n$, we get

$$|\operatorname{Tr}(SXM^tS)| \le \operatorname{Tr}(SXX^tS^*)^{1/2} \operatorname{Tr}(M^tSS^*M)^{1/2} \le n^{1/2} f(z) \operatorname{Tr}(MM^*)^{1/2} \qquad (20)$$


so 


-Tr 


n Vp(1 “ 
In addition, 

1 


- - -mSXM^S)] < 4^Tr(MM*)^/^ (21) 

- Cng) J 


-Tr 
n Vp(1 - Cng) 


¥.{zS^ - S) 


< 


< 


tU 


kl 1 
+ 


np|l —Cn 5 f| ylImzP |Imz 

f{z) 


n 


( 22 ) 


and using ([20]) again, 


Tr 


■E(Tr(5^XM*))E(5) 


< Tr(MM*)^/2 ^ (23) 


n yp^il-Cng)"^ 

Consequently, the combination of (fTl|) . ([2T]) . (1^ . and (|^ gives 

1 Tr(MM*)V 2 \ 


^Tr(A')TV(E(5)i2) < f{z) { -+ 




n 


n 


5/4 




(24) 


By very similar calculations, we get 
1 


n 


■ Tr(A'4?) 


J I Tr(MM0^/^ 


(25) 


Finally, combining relation ([9]) with inequalities m, (CHI), (isi), and 
(I25|), we get 


g-- Tr(i?) 
- n 




□ 


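Finally, the Gaussian case (Proposition 2.3) follows from Lemma 2.5 and the second part of Lemma 2.2 since we have

$$\frac{1}{n} \operatorname{Tr}(R) = (1 - c_n \underline{g}) \, G_{\mu_{MM^*}}\left( z(1 - c_n \underline{g})^2 - (1 - c_n)(1 - c_n \underline{g}) \right).$$

2.3 The general case

We now only assume that $\operatorname{Var}(Y_{1,1}) = 1$ and that $\mathbb{E}(Y_{1,1}^4) < +\infty$. Let $\hat{Y} \in \mathcal{M}_{n,p}(\mathbb{R})$ be an independent random matrix such that the $\hat{Y}_{j,k}$'s are i.i.d. standard Gaussians. We define $\hat{X} = \frac{\hat{Y}}{\sqrt{p}} + M$ and for all $u \in [0,1]$, we define $Y(u) = \sqrt{u} \, Y + \sqrt{1 - u} \, \hat{Y}$, $X(u) = \frac{Y(u)}{\sqrt{p}} + M$, and $S(u) = (zI_n - X(u)X(u)^t)^{-1}$. We have the following, which will allow us to reduce the general case to the Gaussian case.

Proposition 2.6. There exist $s, t > 0$ and a function $f$, bounded on $\mathcal{D}_{s,t}$, such that for any random matrix $Y \in \mathcal{M}_{n,p}(\mathbb{R})$ with i.i.d. entries satisfying $\operatorname{Var}(Y_{1,1}) = 1$, $\mathbb{E}(Y_{1,1}^4) < +\infty$, and $\mathbb{E}(Y_{1,1}) = 0$, for any deterministic matrix $M \in \mathcal{M}_{n,p}(\mathbb{R})$, for all $n$ large enough, and for all $z \in \mathcal{D}_{s,t}$, we have

$$\left| \mathbb{E}G_{\mu_{XX^t}}(z) - \mathbb{E}G_{\mu_{\hat{X}\hat{X}^t}}(z) \right| \le f(z) \left( \mathbb{E}|Y_{1,1}|^3 + \mathbb{E}(Y_{1,1}^4) \right) \left( \frac{1}{\sqrt{n}} + \frac{\operatorname{Tr}(MM^*)^{1/2}}{n} \right).$$

Proof. The proof consists of four main steps. After developing $\mathbb{E}G_{\mu_{XX^t}}(z) - \mathbb{E}G_{\mu_{\hat{X}\hat{X}^t}}(z)$, we use integration by parts formulas (see Lemma 2.7). Then, we respectively focus on bounds for the main terms and the remainders in these integrations by parts.

First step: Development of $\mathbb{E}G_{\mu_{XX^t}}(z) - \mathbb{E}G_{\mu_{\hat{X}\hat{X}^t}}(z)$.

Let $u \in [0,1]$ and $h \in [-u, 1 - u]$. Proposition B.2 (iii) applied to $A = X(u)$ and $B = \frac{1}{\sqrt{p}}(Y(u + h) - Y(u))$ gives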


Siu + h)-S{u) = 5(u + /i) X(u) 


Y{u + h)-Y{u) 


+ 


Vp 

Y{u + h)-Y{u) 


Vp 


-X{u) 


+ 


Y{u + h)- Y{u) (Y{u + h)- Y{u) 


Vp \ Vp 

Dividing by h and taking /i —>■ 0, we get for all u G [0,1], 

Y'{uf Y'{u) 


S{u) 


S'{u) = S{u) 


+ -^X{uy S{u) 


= ^S{u) 

Vp 


Vp Vp 

VuY 


+ 


+ 


y 


Vp Vp 

Y 


+ M 


Yt 


Yt 


2Vu 2VT^ 


u 


VuY^ VTXHY^ 


2Vu 2Vi - u I \ Vp 
Thus we can rewrite


= -TT5(1) --Tr5(0) 
n n 

1 

= - / TvS'(u)du 
n 


/o 


1 




2ny/p , 


+ 


Tr 


S{u) 


'1 — tt 


u 


yyi yyt 

2 _ 2 _ 

\/p \/V 

~\ yy* ^ 


1 - «/ 


+ 


'1 — tt 


rt 


u \ yy* 


1 - « / ^ 


My* My* yM* yM* 

-;-+ 


U y/u VT^ 


U 


du. 


Denoting by 

( 1 ) = Tr5(ii)2 


- 1 

7 

7 

l~[r~ yy* 

[Vp 

\l l-u ^ _ 


l<l<p 




and 


P)= E 4 S{u)l,YuYjj-J^^S{u)l,YuY,,, 


l<j,k<n 

l<l<p 


I 


l<j,/c<n ^ 
l</<p 

l</<p 

1 


1 — u 


u 


S{u)lkYk,iYj^i - S{u)lkYk,iY,^i 


l-U^, ,2 


u 


S{u)i,Yk,iY,,i - S{u)l,Yk,iY,j 


( 5 )= E -7^S{u)l,Mk,iY,j-^^S{u)l,Mk,iYj,i 

1^774. V*^ VI-u 


l<j,k<n 

l<l<p 


(6)= E -^S{u)l,Yk,iM^,i-^^S{u)l,n,iM^,i 

1^774. V*^ VI-w 


l<j,k<n 

l<l<p 
where $S(u)^2_{j,k}$ must be read $(S(u)^2)_{j,k}$, we finally rewrite

$$\mathbb{E}G_{\mu_{X(1)X(1)^t}}(z) - \mathbb{E}G_{\mu_{X(0)X(0)^t}}(z) = \frac{1}{2n\sqrt{p}} \int_0^1 \mathbb{E}\left[ (1) + (2) + (3) + (4) + (5) + (6) \right] du. \qquad (26)$$


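Second step: Integrations by parts.

Let us recall the formulas we will use below.

Lemma 2.7 (see [16, Formulas (2.1.39) and (18.1.19)]).

(i) Let $F \in C^1(\mathbb{R}, \mathbb{R})$ be a function and $\xi$ a random variable with distribution $\mathcal{N}(0, \sigma^2)$. If $\mathbb{E}|F'(\xi)| < +\infty$, then

$$\mathbb{E}(\xi F(\xi)) = \sigma^2 \, \mathbb{E}(F'(\xi)). \qquad (27)$$

(ii) More generally, let $p$ be an integer, $F \in C^{p+1}(\mathbb{R}, \mathbb{R})$ a function, and $\xi$ a real random variable. If $\mathbb{E}|\xi|^{p+2} < +\infty$ and the derivatives $F', \dots, F^{(p+1)}$ are bounded on $\mathbb{R}$, then

$$\mathbb{E}(\xi F(\xi)) = \sum_{j=0}^{p} \frac{\kappa_{j+1}}{j!} \, \mathbb{E}(F^{(j)}(\xi)) + \varepsilon_p \qquad (28)$$

where the $\kappa_{j+1}$'s are the cumulants of the distribution of $\xi$ and

$$|\varepsilon_p| \le C_p \, \mathbb{E}|\xi|^{p+2} \, \|F^{(p+1)}\|_\infty, \qquad C_p \le \frac{1 + (3 + 2p)^{p+2}}{(p+1)!}.$$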
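Formula (28) can be illustrated on a toy case where everything is computable exactly. The sketch below is our own example, not part of the paper: it takes $\xi$ Rademacher ($\pm 1$ with probability $1/2$), whose first four cumulants are $0, 1, 0, -2$, and a polynomial $F$ of degree 3, so that $F^{(4)} = 0$ and the remainder $\varepsilon_3$ vanishes. It also shows that stopping at the Gaussian-type term $\kappa_2 \mathbb{E}(F'(\xi))$ is not enough for a non-Gaussian variable.

```python
from fractions import Fraction
from math import factorial

# Cumulants kappa_1, ..., kappa_4 of the Rademacher law (xi = +1 or -1).
KAPPA = [Fraction(0), Fraction(1), Fraction(0), Fraction(-2)]

def poly_eval(coeffs, x):
    """Value at x of the polynomial sum(coeffs[k] * x**k)."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def poly_derivative(coeffs):
    """Coefficients of the derivative of sum(coeffs[k] * x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def rademacher_mean(coeffs):
    """E[P(xi)] for the polynomial P when xi is Rademacher."""
    return Fraction(poly_eval(coeffs, 1) + poly_eval(coeffs, -1)) / 2

def cumulant_expansion(coeffs, order):
    """sum_{j=0}^{order} kappa_{j+1} / j! * E[F^(j)(xi)], for order <= 3."""
    total, derivative = Fraction(0), list(coeffs)
    for j in range(order + 1):
        total += KAPPA[j] / factorial(j) * rademacher_mean(derivative)
        derivative = poly_derivative(derivative)
    return total

if __name__ == "__main__":
    F = [Fraction(2), Fraction(-1), Fraction(4), Fraction(5)]  # 2 - x + 4x^2 + 5x^3
    lhs = rademacher_mean([Fraction(0)] + F)  # E[xi F(xi)]
    print(lhs, cumulant_expansion(F, 3), cumulant_expansion(F, 1))
```

Exact rational arithmetic is used so that the identity can be checked with equality rather than up to rounding.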


We will apply the Gaussian (27) or the general (28) integration by parts formula for all $j, k, l$ in order to decompose $\mathbb{E}[(1) + (2) + (3) + (4) + (5) + (6)]$ as a sum of terms that we can bound.

Note a first crucial point here. As we want to apply Theorem 11.61 to 
the matrices Y and C in Section [3] in order to obtain O, it will not be 
sufficient to use the integration by parts formula up to order 2, that is why 
we will be interested in terms of order 3 in this formula. 

From now on, $D_{a,b}$ denotes the derivative with respect to $Y_{a,b}$.


Let u E [0,1], j,k E Il,u]|, and I E Il,p]. We denote by Fi and Gi the 
functions defined by FifYj^i) = Yk^iS{u)‘jj^ and Gi{Yj^i) = Yk^iS{uf-f^. We 
have 





Yk,iS{u)j^k- 


Dj,iS{u)j^k + Sj^kS{u) 


2 

k,k ’ 


2u 


Fi{Yj,i) = —Yk,i {{D^^iS{u)pkf + S{u)pk-DliS{u)j,k) 


P 


T djkS{u^k,k-FklS{%L)k,k 

v/p 
(3Dj,iS(u),-k-DfjS(u),-k + S(u),-k-I^liS(u)j,,)


2^3/2 


P' 


3/2 


and 


Gu 

+ ((^k,iS(u)k,k)^ + S{u)k,k-Dl iS{u)k,k) 




Applied conditionally to the variables {Ya^b, 1 < a < n, 1 < b < p} L) 
{ya,b, ia,b) / ij,l)}, gives 

%,i{s{u)l,Yk,iY,,i) = var(y,- 0 % KGUy,, 0 ), 


where denotes the associated conditional expectation. Similarly, from 
), we have 


E,,i{S{u)l,Yk,iY^^i) = Yav{Y^^i)E^^i{F[{Yj^i)) + 


^3(1^7) 




where Ej^i denotes the expectation conditionally to the variables {Ya^b, («, b) / 
(j, 1)} U {Yafi, 1 <a<n, l<b<p}. 

Taking the expectation, we thus have 


E 


S{u)l,Yk,iYj,i - 


u 


1 — u 


S{u)l,Yk,iY,,i 


= Var(y,-z)E(T{(y,-z)) + E(F"(y,-,)) + E(£i,,-^,0 


u 


1 — u 
2 


Var(y,v)E(G;(y,-z)) 


= 5,-E(5(n)^,,) + E(Ff (y,,0) + E(ei,,, 


with 
Dividing by $\sqrt{p}$ and summing over $j, k, l$, we thus have

' ^ _. 

( 1 . 2 ) 


+ E nyk,iS{u),,k.DiiS{u),,k) 

P A U ] 


j,k,l 


(1.3) 


+ 


-y^E(g(ll)fc fc.Dfc fc) 

^ kl 


(1.4) 




(29) 


j,k,l 


Since $S(u)^T = S(u)$, we also have


E(2) — (1.1) + (1.2) + (1.3) + (1.4) H—— E(e2j,fc,i) 


with 


\^ 2 ,j,k,l\ < 


1 + 7 ^ 




IE(l^M)-l|i"i®lloc 


(30) 


Similarly, considering F^{Yk^i) = Yj^iS{u)'j f,, we get 
E(3) = -VpE(Tr5(n)2) + ^^EMl:/piEE^E(y,-z(Dfc,z5(ii),-fe)2) 


(3.1) 


P' 


j,k,l 


(3.2) 


, «3(Ei,i)Vw(1 - U) , ^2 or.O ^ 

+ 3^2 / ^ E(b},;<S(tt)j,fc.Dfcj<S(tt)j^fc) 

P I, ; 


j,k,l 


(3.3) 

E(4) = (3.1) + (3.2) + (3.3) H—— ^^^{£4.j,k,i) 


(31) 

(32) 


with 


k3j,Ml<^^E(y4i).||F3®|U and < ^^E{Y,^^,).\\fP \ 


6 


21 



for all j,k,l, and considering F^{Yj^i) = Mk^iS{u)‘^- j,, 

^ j,k,l 

V 

(5.1) 

^ j,k,l 

'-V-' 

(5.2) 

j,k,l 

E(6) = (5.1) + (5.2) + y~^ K{sej^k,i) 
j,k,l 

with 


(33) 

(34) 


l^5,j,fc,z| ^ 


1 + 7“^ 


E(>\"i)-I 


d3)| 


and 


l^ 6 ,j,fc,z| ^ 


1 + 7^ 


E(na)-ll^t 


(3), 


for all j, k, 1. 

We have thus rewritten 




n^/p. 


E[(1.2) + (1.3) + (1.4) + (3.2) + (3.3) + (5.1) + (5.2)] du 


+ 


2ny/p 


E 


X] X] ^kj,k,l + ^ij,k,l 

j,k,l\'^^i=l i=5 2 


du. 


Third step: Bounds for the main terms.

We will expand the different terms in this expression with the differentiation
formulas in Proposition B.2 (vii), and bound them thanks to inequalities on
traces and resolvents (see Propositions B.1 and B.2 again).

Because the computations are very similar, we only treat the
terms (1.2), (1.3), and (1.4) in detail.

To simplify the notation, from now on we write
$S$ and $X$ for $S(u)$ and $X(u)$.
Let us start with the term (1.2). Using (IHH) and (88), we have




j,k,l 


[Yk,iiSX)liSl, + 2Yk,i{SX),^iSj,kSjjiSX)k,i + Yk,iSl{SX)li 

j,k,l 


= E 


TV(y+ 2 Tr((y o SX)X^S diag(5)5) 


+ EsJ,rEU,,(SA')i,, 

j k,l 


where $\circ$ is the Hadamard product (see Appendix B.1) and $S^{\circ 2}$ denotes $S \circ S$.
Note that it is crucial here to rewrite the terms precisely with the Hadamard
product and then to bound the traces, rather than to bound the entries directly.
Indeed, this gives better powers of n in the bound, which is essential
for the large deviations in Section 3.
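The rewriting of an entrywise triple sum as a trace involving Hadamard squares can be checked on small matrices. The sketch below is only an illustration under assumptions: `P` is a hypothetical stand-in for the matrix $SX$, all matrices have random integer entries, and the identity tested is $\sum_{j,k,l} Y_{k,l} P_{j,l}^2 S_{j,k}^2 = \operatorname{Tr}\big(Y (P^T)^{\circ 2} S^{\circ 2}\big)$, which holds for arbitrary matrices of compatible sizes.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def hadamard_square(A):
    """Entrywise square A o A."""
    return [[a * a for a in row] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def random_matrix(rows, cols):
    return [[random.randint(-5, 5) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
n, p = 3, 4
Y = random_matrix(n, p)
P = random_matrix(n, p)   # hypothetical stand-in for SX
S = random_matrix(n, n)

# Entrywise form: sum over j, k in [1, n] and l in [1, p].
entrywise = sum(Y[k][l] * P[j][l] ** 2 * S[j][k] ** 2
                for j in range(n) for k in range(n) for l in range(p))

# Trace form with Hadamard squares.
trace_form = trace(matmul(matmul(Y, hadamard_square(transpose(P))),
                          hadamard_square(S)))
print(entrywise, trace_form)
```

Once the sum is in trace form, it can be bounded with operator norms and Hilbert-Schmidt norms instead of entry by entry, which is where the gain in powers of n comes from.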

Using Propositions B.1 and B.2 and the Cauchy-Schwarz inequality,
and denoting by $y$ a square root of $z$, we have


Tr{Y{X^S)°^S°^)\ 


< ^\\X*Sf.\\SfTT{YY^f/^ 

^ u^ ^ Tr(yy*)i /2 ^ 


I ImypI Imzp 


Th((y o5X)A*5diag(5)5)| 

< v^||X*5||.|| diag(5)||.||5|| Tr((y o SX)(Y o SX)*)^/^ 

Tr(yy ‘)^/2 ^ 


< 


I ImyPI Imz| 


and 


E-5lrEU,(SA)J,, 


< 


< 


n 


I Imzpl Imy| 


Y,\Yk,i{SX),,i 


k,l 


1/2 


n 

I Imzpl Imyl 


YKi 


k,l 


Y.\^SX)k,i 

k,l 


< 


„^(j,y.)l/2 1 

nTr(yy*)V2 / ^1^ 


+ 


Imzp|Imy| \|Imzp \lmz 


1/2 


1/2 
Using also the bound (90), there exists a function $f$, bounded on $\mathcal{D}_{s,t}$ and
independent of $Y$, $M$, and $n$, such that for all $z \in \mathcal{D}_{s,t}$, we have

$$|(1.2)| \le |\kappa_3(Y_{1,1})|\, f(z)\, \mathbb{E}\big(\operatorname{Tr}(YY^*)^{1/2}\big).$$

But for a centred random variable, the third cumulant equals the third
moment, so this inequality can be rewritten

$$|(1.2)| \le \mathbb{E}|Y_{1,1}|^3\, f(z)\, \mathbb{E}\big(\operatorname{Tr}(YY^*)^{1/2}\big).$$

We adopt the same strategy for the term (1.3). We have 

Y,^iyk,iS{u)j,k.DliS{u)j,k) 


j,k,l 


'^nYk,iSj,k.2{SjjSj,k + {SX)liSj,k 

j,k,l 

+ Sjj{X^SX\iSj^k + 2SjjiSX),^iiSX)k,i)] 


= 2E 


+ Tr(y(X*5)°25°2) 


+ + 2Tr((y o 5X)y*5diag(5)5) 

hi 


so, using the previous bounds, and also 


3,1 


< 


\ 3,1 

Tr(s°2yy‘(s°2)*)V2 

y^ll ,_\ ^/2 


1/2 


< 


< 


I Im z 
^/rep 


n-, 


■ Tr(yy 


:)l/b 


Im z| V11™ ' J 

7j3/4pl/2 


ImzP 


Tr(yy*)V 2 


and 

YSjAs°^y)3,ii^'sx\i 

3,1 


< 


Im z\ 


1 + 


Im z 


n^/^pi /2 


Imz 




Tr(yy*)V 2 ^ 


1/2 
the same arguments as above lead to

$$|(1.3)| \le \mathbb{E}|Y_{1,1}|^3\, f(z)\, \mathbb{E}\big(\operatorname{Tr}(YY^*)^{1/2}\big).$$

Besides, we have

$$\sum_{k,l} \mathbb{E}\big(S(u)_{k,k}\, D_{k,l}S(u)_{k,k}\big) = \sum_{k,l} \mathbb{E}\big[S_{k,k} \cdot 2 S_{k,k} (SX)_{k,l}\big]$$

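and

$$\Big|\sum_{k,l} S_{k,k}^2 (SX)_{k,l}\Big| \le \frac{1}{|\operatorname{Im} z|^2} \sum_{k,l} |(SX)_{k,l}| \le \frac{\sqrt{np}}{|\operatorname{Im} z|^2} \Big(\sum_{k,l} |(SX)_{k,l}|^2\Big)^{1/2} = \frac{\sqrt{np}}{|\operatorname{Im} z|^2} \operatorname{Tr}(SXX^*S^*)^{1/2} \le \frac{\sqrt{np}}{|\operatorname{Im} z|^2} \Big(\frac{n|z|}{|\operatorname{Im} z|^2} + \frac{n}{|\operatorname{Im} z|}\Big)^{1/2},$$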

thus we get 

|(i.4)| <E|yi,i|3/(^)V^. 

We finally have 

1(1.2) + (1.3) +(1.4)1 <E|yi,i|3/(^)(E(Tr(yy*)i/2) + ^^ . (35) 

Very similar computations allow to show that 

1(3.2) + (3.3)1 < E|yij|V(2)E(Tr(yy‘)^/2) 

and 

1(5.1) + (5.2)1 <E|yi,i|V(-2)\/nTr(MM*)i/2_ (3g) 

If we remember that Wp and yi^i have mean zero and variance 1, we have 
E(Tr(yy^))^/^ < ^ynp and E(Tr(yy*))^/^ < ^ynp by Jensen’s inequality. 
Finally, we can write

$$|(1.2) + (1.3) + (1.4) + (3.2) + (3.3) + (5.1) + (5.2)| \le \mathbb{E}|Y_{1,1}|^3\, f(z) \big(n + \sqrt{n}\, \operatorname{Tr}(MM^*)^{1/2}\big). \tag{37}$$




k,l 


k,l 


< 


Im z\ 




Fourth step: Bounds for the remainders.

It only remains to bound the remainders that appeared in the integration
by parts formulas. We recall that for all $j,k \in [\![1,n]\!]$, $l \in [\![1,p]\!]$, we have

$$|\varepsilon_{1,j,k,l}| \le \frac{1+7^4}{6}\, \mathbb{E}(Y_{1,1}^4)\, \|F_1^{(3)}\|_\infty.$$

Using the expression of $F_1^{(3)}(Y_{j,l})$, the differentiation formulas (fMl), (88), (89),
and inequalities (iv)-(vi) in Proposition B.2, there exists a function $f$,
independent of $Y, M, n, j, k, l$ and bounded on $\mathcal{D}_{s,t}$, such that

< f{z)E{Y,\) i^\Yk,i\ + -6, A ■ 

So, using the Cauchy-Schwarz inequality in we have 


1 

y/P 

< f{z) E(y4i) (E(Tr(yy*)V2) + 

< f{z)E{Yl,)n. 


j,k,i 


< /(z)E(y4i) (ij^E|yfc,;| + V^ 
k,l 


(38) 


The same bound holds for 


1 

vT 




Similarly, we get 


1 

y/P 


j,k,l 


</(z)E(y4i)n 


and 


^ ^ ^5,j,k,l T ^6,j,k,l 
j,k,l 


< /(z)E(y]^i)V^Th(MM*)i/^ 


Finally, combining relations from (26) to (40), we get


(39) 


(40) 


<f{z) (E|yi,i|3+E(yi"i)) 


1 Tr(MM*)V2\ 
n n 1 


(41) 

□ 


We can now conclude the proof of the general case and obtain Theorem 1.6.
In fact, in Proposition 2.6, we assumed that $\mathbb{E}(Y_{1,1}) = 0$, so we only
have to remove this assumption.

Proof. We recall that $\overline{Y} = Y - \mathbb{E}(Y)$ by definition. We also define
g{z) = EG^^^^iz), go{z) = EG^^^^{z), and g{z) = EGf,^^^{z). Using 
the notations in Lemma EH we have


Igiz) - (1 - cg{z))Gf,^^, {z{l - cgiz)f - (1 - c)(l - cg{z)))\ 

< \giz)-goiz)\ + \go{z)-g{z)\ + \g{z) - ^^^f,^^,{g{z),c)\ 

+ I i 9 iz),c)- {go{z),c)\ 

+ I (9o {z),c)- {g{z) , c) I 

< (1 + ls,t) \g{z) - go{z)\ + (1 + ls,t) \go{z) - g{z)\ 

+ \g{z) - (t>z,i,^^t{g{z),c)\ 


for s large enough and t small enough by Lemma 2.2.

Since the matrix X — X = E(X) has rank at most 1, using the relations ([5]) . 

m, 3)11(1 (j92p , W6 ll3V6 


(z) < ds,t < dKS < - • 




Proposition 2.3 (the Gaussian case) applied to Y and Proposition 2.6 (the
centred case) applied to Y finally yield


\giz) - (1 - cg{z))Gf,^^, {z{l - cg{z)f - (1 - c)(l - cg{z)))\ 



1 Tr(MM^)i/2 


)) 


—j= H- 

Vn n 



□ 


3 Large deviations 

This section is devoted to the proof of Theorem 1.7. In this section,
$X \in \mathcal{M}_{n,p}(\mathbb{R})$ is a random matrix such that $c_n = \frac{n}{p} \to c \in (0,+\infty)$. Moreover, we
assume that $\operatorname{Var}(X_{1,1}) = 1$ and that there exist $\alpha \in (0,2)$ and $a \in (0,+\infty]$
such that $X_{1,1} \in \mathcal{S}_\alpha(a)$ (see Definition 1.1).

We define 


and we decompose the matrix X as

$$\frac{X}{\sqrt{p}} = A + B + C + D, \tag{42}$$

where A, B, C, D are the matrices defined by


Aj^k — 
Cj^k — 


X 


j,k 


■^j,k 




^ -*-£(n)VP<|^i,4|<£(n) ^y/p -J."- ^ 

Besides, we denote by $B_{s,t}(\mu,\delta)$ the ball with centre $\mu \in \mathcal{P}(\mathbb{R})$ and radius
$\delta > 0$ for the distance $d_{s,t}$.


3 k 


Dj,k — 




j,k 




3.1 Exponential equivalences 

The goal of this subsection is to prove the following.

Proposition 3.1. There exist $s, t > 0$ such that the random distributions
$\mu_{XX^*/p}$ and $(\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2$ are $d_{s,t}$-exponentially equivalent at scale
$n^{1+\alpha/2}$ as $n \to +\infty$, i.e. for all $\delta > 0$, we have

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(d_{s,t}\big(\mu_{XX^*/p},\, (\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2\big) > \delta\Big) = -\infty.$$

The strategy to prove Proposition 3.1 is similar to the one in [7]. First,
we explain why the contributions of B and D to the large deviations can
be neglected (Lemmas 3.2 and 3.3). Then, we show that the measures
$\mu_{(A+C)(A+C)^*}$ and $(\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2$ are exponentially equivalent thanks
to a conditioning and a coupling argument in which several tools are needed,
such as the concentration property (82) and the asymptotic freeness result
stated in Theorem 1.6. From now on, we consider s > 2 and t > 0.

First, the contribution of D is negligible.

Lemma 3.2. $\mu_{XX^*/p}$ and $\mu_{(A+B+C)(A+B+C)^*}$ are exponentially equivalent.

The proof is very similar to the one in [7], the only difference being
the use of (92) instead of (91). Therefore, it will not be repeated here.

The contribution of B is also negligible.

Lemma 3.3. $\mu_{XX^*/p}$ and $\mu_{(A+C)(A+C)^*}$ are exponentially equivalent.

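Proof. From Lemma 3.2, the triangle inequality, Lemma 1.2.15 in [9], and
the inequality $d_{s,t} \le W_1 \le W_2$, it is sufficient to prove that for all $\delta > 0$,

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\big(W_2(\mu_{(A+B+C)(A+B+C)^*}, \mu_{(A+C)(A+C)^*}) > \delta\big) = -\infty.$$

From (94), which is the analogue of the Hoffman-Wielandt inequality (93)
for covariance matrices, it is sufficient to check that for all $\delta > 0$,

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(\frac{2}{n^2} \operatorname{Tr}\big((A+B+C)(A+B+C)^* + (A+C)(A+C)^*\big) \operatorname{Tr}(BB^*) > \delta\Big) = -\infty.$$

Let $\delta > 0$. We have

$$\operatorname{Tr}\big((A+C)(A+C)^*\big) \le \operatorname{Tr}\big((A+B+C)(A+B+C)^*\big) \le \operatorname{Tr}\Big(\frac{1}{p} XX^*\Big)$$

using the decomposition (42). Thus,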

P ( 4 Tr((A + B + C){A + B + CY + {A + C){A + Cf) Tr(SS^) > <5 




< 

< 


n^p 


Tt{XXYTi{BBY > 5 


— Tr(XXM > E(X?,) + 5 ) 

\ 

\n ^ ’-nxlp) + 5) 

(43) 

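On the one hand, since $\operatorname{Tr}(XX^*)$ is the sum of $np$ i.i.d. random variables,
from Cramér's theorem in $\mathbb{R}$ (see [9, Theorem 2.2.3]), we have

$$\lim_{n\to+\infty} \frac{1}{np} \log \mathbb{P}\Big(\frac{1}{np} \operatorname{Tr}(XX^*) > \mathbb{E}(X_{1,1}^2) + \delta\Big) = -\sup_{\theta \in \mathbb{R}} \Big(\theta\big(\mathbb{E}(X_{1,1}^2) + \delta\big) - \log \mathbb{E}\big(e^{\theta X_{1,1}^2}\big)\Big) < 0$$

so, since $\alpha < 2$,

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(\frac{1}{np} \operatorname{Tr}(XX^*) > \mathbb{E}(X_{1,1}^2) + \delta\Big) = -\infty. \tag{44}$$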

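On the other hand, since $\frac{n}{p} \to c$, the same arguments as in [7] lead to

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(\frac{1}{n} \operatorname{Tr}(BB^*) > \frac{\delta}{4\big(\mathbb{E}(X_{1,1}^2) + \delta\big)}\Big) = -\infty. \tag{45}$$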


Finally, combining (43), (44), (45), and Lemma 1.2.15 in [9], we get the
exponential equivalence of $\mu_{XX^*/p}$ and $\mu_{(A+C)(A+C)^*}$. □

Before proving Proposition 3.1, we need some additional properties.

Lemma 3.4. (i) We have

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(\frac{1}{n} \operatorname{Tr}(CC^*) > (\log n)^2\Big) = -\infty.$$

(ii) Defining $I = \{(j,k) \mid |X_{j,k}| > (\log n)^{2/\alpha}\}$, for all $\delta > 0$, we have

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\big(|I| > \delta n^{1+\alpha/2}\big) = -\infty.$$

(iii) We denote by $P_n$ the distribution of $X_{1,1}$ conditionally to
$\{|X_{1,1}| \le (\log n)^{2/\alpha}\}$. Let $Z_n$ be a random variable with distribution $P_n$. There
exists $C > 0$ such that

$$\sup_{n \in \mathbb{N}} \max\big(\mathbb{E}(Z_n^2), (\mathbb{E}(Z_n^2))^2, \mathbb{E}(Z_n^4)\big) \le C.$$

Furthermore, the variance of $Z_n$, denoted by $\sigma_n^2$, tends to $\operatorname{Var}(X_{1,1}) = 1$
as $n \to +\infty$ and more precisely, there exists $\eta > 0$ such that


Proof. The proofs of (i) and (ii) exactly follow the proof of Lemma 2.4 in 
[7]. Therefore, we will only prove (iii). 

Let Zn be a random variable with distribution Pn defined as above. We 
have 


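$$\mathbb{E}(Z_n^2) = \mathbb{E}\big(X_{1,1}^2 \,\big|\, |X_{1,1}| \le (\log n)^{2/\alpha}\big) = \frac{\mathbb{E}\big(X_{1,1}^2 \mathbf{1}_{|X_{1,1}| \le (\log n)^{2/\alpha}}\big)}{\mathbb{P}\big(|X_{1,1}| \le (\log n)^{2/\alpha}\big)}.$$

But thanks to hypothesis (1), $X_{1,1}^2$ is integrable, so by the dominated
convergence theorem, $\mathbb{E}\big(X_{1,1}^2 \mathbf{1}_{|X_{1,1}| \le (\log n)^{2/\alpha}}\big)$ tends to $\mathbb{E}(X_{1,1}^2)$ as $n \to +\infty$.

Besides, $\mathbb{P}(|X_{1,1}| \le (\log n)^{2/\alpha})$ tends to 1, so $\mathbb{E}(Z_n^2)$ tends to $\mathbb{E}(X_{1,1}^2)$ as
$n \to +\infty$.

The same arguments show that $\mathbb{E}(Z_n^4)$ tends to $\mathbb{E}(X_{1,1}^4)$ as $n \to +\infty$. We
can deduce that there exists a real number $C$ such that

$$\sup_{n \in \mathbb{N}} \max\big(\mathbb{E}(Z_n^2), (\mathbb{E}(Z_n^2))^2, \mathbb{E}(Z_n^4)\big) \le C.$$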

Moreover, we have 

= Var(Xi,i | |Xi,i| < (logn)^/“) 

_ E (^1,1 l|Xi,i|<(logn)2/“) /lE 

P(|Xi,i| < (logn)2/“) I P(|Xi,i| < (logn)2/“) 

Using similar arguments, we prove that $\sigma_n^2$ tends to $\operatorname{Var}(X_{1,1}) = 1$ as $n \to +\infty$.
More precisely, we can write


|Xi,l|<(logn)2/“^ \ 

< (logn)^/“) I 

-E(x2,) + (E(Xi,i))2 

E(^?,il|XMl<{iogn)2/'^) < (logn)2/“) 

E(|Xi,i| < (logn)2/“) 

E(Xi,i)2p(|Xi,i| < (logn)2/“)'-E(Xi,il|;,^^^|<(i„g„)2/.)' 

P(|Xi,i| < (logn)2/“)^ 

E(X2,)P(|Xi,i| > (logn)2/“) 

E(|Xi,i| < (logn)2/“) 

E(Xi,i)2 (p (|Xi,i| < (logn)2/“)' - l) 

^ P(|Xi,i| < (logn)2/«)2 

2E(Xi,i)E l|Xi,i|>(iogn)2/<^) 

P(|Xi,il < (logn)2/«)^ 

IE (^1,1 l|Xi,i|>(logn)2/-) 

-^^ (46) 

P(|Xi,i| < (logn)2/“)^ 

where E l|Xi,i|<{iogn)2/“) = (lE(-^i,i) - E 4|Xi,i|>(iogn)2/'^)) 

was used to get the last equality. 

From hypothesis ([T]), for n large enough, we have 

P(|^i,i| > (logn)2/") < . 

Besides, 


P (^1,1 l|Xi,i|<(logn)2/«) /E 1 

P(|Xi,i| < (logn)2/«) y P(|Xi,i| 


P(|Xi,i| < (logn)2/“)2-l 

= -P(|^i,i| > (logn)2/“) (p(|Xi,i| < (logn)2/“) + l) 
and by the Cauchy-Schwarz inequality, we have 

E (Xi,i l|Xi,P>(logn)2/^) < E(X2i)V2p(|Xi^i| > (logn)2/“)V2 


and 


E 


(vL l|x,. 


|>(logri)^/° 


> (iogn)2/“)V2, 
Going back to (46), we have for n large enough


al-l\< 2E(Xi2i)e-t(i°g«)" +2E(X^^i)^/2g-t(iogn)2 

+ 4E(Xi,i)2e-t(^°s")" +4|E(Xi,i)|E(Xi2i)^/2g-f(iogn)2 

+ 2E(Xl2l)e-t('°g”)^ 


Because the moments of $X_{1,1}$ are finite, we can deduce that there exists a
real number $\eta$ such that

Wn-M < 


□ 
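To illustrate the statement of Lemma 3.4 (iii) on a concrete case, here is a minimal numerical sketch. It assumes, purely for illustration, that $X_{1,1}$ follows a centred Laplace distribution with variance 1 (so that $\alpha = 1$ is a natural choice and the truncation level is $(\log n)^2$). For this law the truncated variance has a closed form, and the gap $1 - \sigma_n^2$ equals $e^{-K/b}(K^2 + 2bK)/(1 - e^{-K/b})$ with $b = 1/\sqrt{2}$ and $K = (\log n)^{2/\alpha}$. The function names are hypothetical.

```python
import math

B = math.sqrt(0.5)  # Laplace scale b; the variance is 2 b^2 = 1

def threshold(n, alpha=1.0):
    """Truncation level (log n)^(2/alpha)."""
    return math.log(n) ** (2.0 / alpha)

def variance_gap(K):
    """1 - Var(X | |X| <= K) for the centred Laplace law with scale B.

    Uses E(X^2 1_{|X| <= K}) = 1 - e^{-K/B} (K^2 + 2 B K + 1)
    and P(|X| <= K) = 1 - e^{-K/B}; the conditional mean is 0 by symmetry.
    """
    tail = math.exp(-K / B)
    return tail * (K * K + 2.0 * B * K) / (1.0 - tail)

sizes = [10, 100, 1000, 10_000]
gaps = [variance_gap(threshold(n)) for n in sizes]
for n, g in zip(sizes, gaps):
    print(n, g)
```

In this example the gap is positive and decreases faster than any power of $1/n$, which is consistent with the kind of bound on $|\sigma_n^2 - 1|$ obtained in the proof.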


We can now prove Proposition 3.1.

Proof. The proof relies on a conditioning with respect to the entries of X
which are not in A and on a coupling argument to remove the dependency
between A and C.

We use here the same notations as [7]. We denote by $\mathcal{F}$ the $\sigma$-algebra

$$\mathcal{F} = \sigma\big(X_{j,k} \mathbf{1}_{|X_{j,k}| > (\log n)^{2/\alpha}}\big),$$

$\mathbb{P}_{\mathcal{F}}$ and $\mathbb{E}_{\mathcal{F}}$ the probability and the expectation conditionally to $\mathcal{F}$, and we
denote by E and F the events

$$E = \Big\{\frac{1}{n} \operatorname{Tr}(CC^*) \le (\log n)^2\Big\}$$

and

$$F = \big\{|I| \le n^{1+\alpha/2}\big\},$$

with $I = \{(j,k) \mid |X_{j,k}| > (\log n)^{2/\alpha}\}$. Thus, the matrix C is $\mathcal{F}$-measurable
and the events E and F belong to $\mathcal{F}$. Moreover, from Lemma 3.4 (i)-(ii),
we have

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}(E^c) = -\infty \quad \text{and} \quad \lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}(F^c) = -\infty. \tag{47}$$

Besides, conditionally to $\mathcal{F}$, $\sqrt{p}A$ is a random matrix with independent
entries bounded by $(\log n)^{2/\alpha}$. From the concentration result (82) applied
to $Y = \sqrt{p}A$, $M = C$, $\kappa = (\log n)^{2/\alpha}$, from the inequality $d_{s,t} \le W_1$, and
using that $\alpha < 2$, we get for all $\delta > 0$ and n large enough,

l£;PjF (ds.t {P(A+C){A+cy P^{A+C)(A+Cy) > 

^ /3(logn)^/" / Tn?5^ \ 

- 537 ^ V /?(logn)8 /“) 
hence

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(E \cap \big\{d_{s,t}\big(\mu_{(A+C)(A+C)^*}, \mathbb{E}_{\mathcal{F}}\mu_{(A+C)(A+C)^*}\big) > \delta\big\}\Big) = -\infty. \tag{48}$$

We will now use a coupling argument. We consider an independent
random matrix Y whose entries are i.i.d. with distribution $P_n$ defined in
Lemma 3.4 and we denote by $A'$ the matrix defined by

$$A'_{j,k} = \mathbf{1}_{(j,k) \notin I}\, A_{j,k} + \mathbf{1}_{(j,k) \in I}\, \frac{Y_{j,k}}{\sqrt{p}}.$$

Consequently, $\sqrt{p}A'$ and Y have the same distribution and are independent
from $\mathcal{F}$. In particular, we will use later that for all bounded continuous $f$,
we have $\mathbb{E}_{\mathcal{F}}(f(Y)) = \mathbb{E}(f(Y))$.

From the inequalities (94) and $d_{s,t} \le W_2$, we have

ds,t{P^{A+C)iA+cy E{A'+C)iA'+cy)'^ 

< ^ W((^ - A')(A - A'Y) Tr((A + C){A + Cf + (W + C)(W + Cf) 


{X] ^u,k)ei ^ ) I + Cj^kf + {^j,k + 


3,k 


j,k 


:^\ X] 1 I + ‘^^Ik + i^'j,k 




j,k 


y2 
^ j,k 


s; E E] \YAk + ^Mcc‘) + Y,Ay+ y, -f 


n^p 

2 

< — 
n^p 




j,k 


j,k 


{3,k)&I 


p 


2 np 


(logn)'^/'^ 

p 


+ TKCC‘)] I E Ed+;( E Y 


With definition (5) of $d_{s,t}$ and conditional Jensen's inequality for the concave
function $x \mapsto x^{1/4}$, we thus have

1e lFds,t P{A+C){A+cy^'^T P{A'+C){A'+cy) 

< 1e lF^Tds,t {P{A+C){A+cyE{A'+C){A'+cy) 


< 


21e1 


E J-F 


n'^p 


2(n(logn)"/“ + Tr(CC‘))E^ E/, 

\{3,k)el 
1/4 


+ -E^ 
p 


/ 

\ 

y 

E 



i3,k)&I 

/ 
because $\mathbf{1}_E$, $\mathbf{1}_F$, and $\operatorname{Tr}(CC^*)$ are $\mathcal{F}$-measurable. Since the events $\{(j,k) \in I\}$
are $\mathcal{F}$-measurable and Y is independent from $\mathcal{F}$, we have


Ef 


( Hhk)&I Yk 
\ kk 


J2^u,k)^imlk) = \imyli)<c\i\ 

j,k 


from Lemma 3.4 (iii), and similarly




E 


^{j,k)£l 




So we have 


If ^Fds,t (EF^(A+C')(yl+C)‘,EF/^(A'+C)(A'+C')*) 


< 


< 


< 


f2(n(logn)4/“ + Tr(CC*))|/| + Vi' 

n^p y p 

^2 ^n(logn)^/" + n(logn)^^ j^i+«/2 _|_ 

‘)tr 1 

—^.3n(log 



(eCcn)^/'^ 


(log n)^/“ 

nV4-«/8 


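for n large enough (we used here the fact that $\frac{4}{\alpha} > 2$). It follows that for all
$\delta > 0$,

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(E \cap F \cap \big\{d_{s,t}\big(\mathbb{E}_{\mathcal{F}}\mu_{(A+C)(A+C)^*}, \mathbb{E}_{\mathcal{F}}\mu_{(A'+C)(A'+C)^*}\big) > \delta\big\}\Big) = -\infty. \tag{49}$$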

In addition, we define $\sigma_n^2 = \operatorname{Var}(Y_{1,1})$ as in Lemma 3.4 (iii). Since C is
$\mathcal{F}$-measurable, Y is independent from $\mathcal{F}$, and the fourth moment of the entries
of $Y/\sigma_n$ is bounded for n large enough, we can apply Theorem 1.6 to $Y/\sigma_n$ and C,
conditionally to $\mathcal{F}$. Therefore, for n large enough, s large enough, and t small enough, we
have


y ,^\{y , {Vf^cc^ ®c V/^MP,c)‘ 


/, , 1 Tr(CC*)i/2\ 

+ Cs i lij Cn — c H-1- 

\ n 


— ^s,t 


) 

( 8(logn)®/" 16(logn)®/"^ f 1 logn 

Iv ^ 


+ 


n ^Jn 


1 log n 

+ Cs t I |Cn ~ C H-1- 

' n n'*/4 


using Jensen's inequality and the fact that for all $j \in [\![1,n]\!]$ and $k \in [\![1,p]\!]$,
we have $|\overline{Y}_{j,k}| = |Y_{j,k} - \mathbb{E}_{\mathcal{F}}(Y_{j,k})| \le 2(\log n)^{2/\alpha}$. Therefore, for all $\delta > 0$,


1 


li™ 1 i /o 

n^+oo 77,J-+“/^ 


log P (-E n 


4,t ic)'’ 


(50) 


To finish, from (19411 . we have 




1 Vyy^' 


(ynj p 


X Tr 


.Vp 


+ c 


Y 


+ c] + 


Y 


+ C 


^ny/P J \Crny/P 


Y 


+ C 


2 

< — 


1 - 


n^p \ ar. 


wv‘)|j:(%+Ci,d + 


j,lc 


\Vp 


Yj,k 

(^ny/P 


+ Cj^k 


2 

< — 
n^p 


1 - Tr(yy^) (^ att { cc ^) + ^ (^1 + Tr(yy^)^ 
so, using conditional Jensen's inequality and doing as above, we get

-^(l-—) f4n(logn)2E^Tr(yy') + -(l + ^) E^(Tt(yy*)2)^ 

n^p V o-n/ V P\ ^ij ) 


1/4 


< 


2 / cjH - 1 \ V . .2 ^ 2 (. \ 


n^P \crn{crn + 1 ) 


4n(logn)"^.npC H— 1 H—n -n^p^C 


- 1 

cj„(cr„ + 1) 


1/2 


8C(logn)^ + 4C ( 1 + 


P \ CTr. 

1/4 


1 1/4 


crt 


By Lemma 3.4 (iii), we deduce that for all $\delta > 0$,


t I > J M = —oo . 

(51) 


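To conclude, combining equalities from (47) to (51), Lemma 3.3, and
Lemma 1.2.15 in [9], for s large enough and t small enough, we have for all
$\delta > 0$,

$$\lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\Big(d_{s,t}\big(\mu_{XX^*/p},\, (\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2\big) > \delta\Big) = -\infty.$$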


□ 


3.2 Large deviations for $\mu_{C'}$

In the previous subsection, we proved that $\mu_{XX^*/p}$ and $(\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2$
are exponentially equivalent. Consequently, to obtain the large deviations of
$\mu_{XX^*/p}$ (Theorem 1.7), it is sufficient to study the large deviations of $\mu_{CC^*}$
and to apply the contraction principle (see [9, Theorem 4.2.1]). For this, in
this subsection, we will study the large deviations of $\mu_{C'}$, where

$$C' = \begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix},$$

and prove the following, from which we will deduce the large deviations of
$\mu_{CC^*}$ thanks to identity (54) and conclude in the next subsection.

Proposition 3.5. The measure $\mu_{C'}$ satisfies the LDP with speed $n^{1+\alpha/2}$ in
$\mathcal{P}(\mathbb{R})$, for the weak topology and the good rate function $\Phi'$ defined by


$'(/r) 


f ciW 2 ma{p) if P is symmetric and /x({0}) > 
+00 otherwise 


(52) 
where $m_\alpha(\mu) = \int |x|^\alpha \, d\mu(x)$ denotes the $\alpha$-th moment of $\mu$.

Note that $\Phi'$ is a good rate function because it is well known that for all
$m > 0$ and $p > 0$, the set

$$K_{p,m} = \Big\{\mu \in \mathcal{P}(\mathbb{R}) \;\Big|\; \int |x|^p \, d\mu(x) \le m\Big\} \tag{53}$$

is compact for the weak topology. Moreover, the domain of $\Phi'$ can be
explained thanks to Lemma 3.6 (i).

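Lemma 3.6. Let $M \in \mathcal{M}_{n,p}(\mathbb{R})$ and

$$M' = \begin{pmatrix} 0 & M \\ M^* & 0 \end{pmatrix}.$$

(i) The distribution $\mu_{M'}$ is symmetric and $\mu_{M'}(\{0\}) \ge \frac{|1-c_n|}{1+c_n}$.

(ii) We have

$$\mu_{M'}^2 = \frac{2c_n}{c_n+1}\, \mu_{MM^*} + \frac{1-c_n}{1+c_n}\, \delta_0. \tag{54}$$

(iii) If M is diagonal, in the sense that only the entries $M_{j,j}$, $1 \le j \le n \wedge p$,
can be non-zero, then

$$\mu_{M'} = \frac{1}{n+p} \sum_{j=1}^{n \wedge p} \big(\delta_{M_{j,j}} + \delta_{-M_{j,j}}\big) + \frac{|1-c_n|}{1+c_n}\, \delta_0. \tag{55}$$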

The proof of this lemma does not present any difficulty and is left to the
reader. We also need a second lemma, which consists of two estimates for
the distribution of $X_{1,1}$. These estimates come from the particular form of
this distribution (see the hypotheses of Definition 1.1).
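The structure described in Lemma 3.6 can be checked by hand on a small diagonal example. The sketch below is an illustration under assumptions: it takes $n \le p$, a hypothetical diagonal $M$ with non-zero entries 3 and 5, and verifies explicit eigenvectors of the block matrix $M'$ instead of calling a numerical eigenvalue routine. The helper names are not from the paper.

```python
from fractions import Fraction

def block_matrix(M):
    """Build M' = [[0, M], [M^T, 0]] for a real n x p matrix M."""
    n, p = len(M), len(M[0])
    Mp = [[0] * (n + p) for _ in range(n + p)]
    for j in range(n):
        for k in range(p):
            Mp[j][n + k] = M[j][k]
            Mp[n + k][j] = M[j][k]
    return Mp

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

n, p = 2, 4                      # assumes n <= p
diag = [3, 5]                    # hypothetical non-zero diagonal entries M_jj
M = [[diag[j] if j == k else 0 for k in range(p)] for j in range(n)]
Mp = block_matrix(M)

# Explicit eigenpairs: e_j +/- e_{n+j} for +/- M_jj, and e_{n+k} (k >= n) for 0.
eigenpairs = []
for j, m in enumerate(diag):
    for sign in (1, -1):
        v = [0] * (n + p)
        v[j], v[n + j] = 1, sign
        eigenpairs.append((sign * m, v))
for k in range(n, p):
    v = [0] * (n + p)
    v[n + k] = 1
    eigenpairs.append((0, v))

eigenvalues = sorted(lam for lam, _ in eigenpairs)
zero_mass = Fraction(eigenvalues.count(0), n + p)
c_n = Fraction(n, p)
print(eigenvalues, zero_mass, abs(1 - c_n) / (1 + c_n))
```

The $n + p$ vectors listed are linearly independent, so they account for the whole spectrum: it is symmetric, and the mass at zero equals $|1 - c_n|/(1 + c_n)$ in this example, as in (55).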

Lemma 3.7. (i) There exists a sequence $(\eta_n)_{n \in \mathbb{N}}$ converging to 0 such
that for all $x > \varepsilon(n)$, we have

P(|Xi,i| > x^) < . (56) 


(ii) We denote by Sa the support of the distribution Da defined by m- 
There exists a sequence (a„)neN converging to a such that for all x gM. 
satisfying |x| > s{n) and sign(x) E Sa, for all 7 > 0, and for all n 
large enough, we have 

P E (x - 7 , X + 7 )^ > . (57) 
The computations to get these inequalities are explained in [7, p. 26]
and are left to the reader.

We will now prove Proposition 3.5. Let us mention that Schatten's
inequality (95) will be crucial in the proof since it links the $\alpha$-th
moment of the spectral measure $\mu_{C'}$ to the entries of C.

Proof. Since the set of symmetric probability measures on $\mathbb{R}$ is closed for
the weak topology, it is enough to prove the LDP on this set, see [9, Lemma
4.1.5].


Upper bound. Let $\mu$ be a symmetric probability measure on $\mathbb{R}$. Since the
function $m_\alpha$ is lower semi-continuous, there exists a continuous function $h$
such that $h(0) = 0$ and

$$\mathbb{P}\big(\mu_{C'} \in B_{s,t}(\mu,\delta)\big) \le \mathbb{P}\big(m_\alpha(\mu_{C'}) \ge m_\alpha(\mu) - h(\delta)\big)$$

for all $\delta$ small enough. Moreover, by Schatten's inequality (95) and the fact
that $\sum_{i=1}^{k} a_i^r \le \big(\sum_{i=1}^{k} a_i\big)^r$ for all $r \ge 1$, $a_1, \ldots, a_k \ge 0$, we have


n-\-p n+p 


n+p /n+p \ 


j=i k=i 


Consequently, 
P(/XC' G 


< 


= P 


1 


n+p 


> m„(^) - h{5) 


< 


jl^ 


for all $a_1 \in (0,a)$ by Chernoff's inequality. Besides, from hypothesis (1),
there exists $a_2 \in (a_1,a)$ such that for all $x$ large enough, $\mathbb{P}(|X_{1,1}| \ge x) \le
\exp(-a_2 x^\alpha)$. Let us also recall the following integration by parts formula:
for all $\mu \in \mathcal{P}(\mathbb{R})$ and $f \in C^1(\mathbb{R},\mathbb{R})$,

$$\int_a^b f(x)\, d\mu(x) = f(a)\mu([a,+\infty)) - f(b)\mu([b,+\infty)) + \int_a^b f'(x)\mu([x,+\infty))\, dx.$$

Denoting by $P_{|X_{1,1}|}$ the law of $|X_{1,1}|$, we thus have


Eexp 



pe(n 

< 

1+ / 


Jein), 

< 

1 + 

< 

1 + 




re{n) 

'£(n)y/p 




y/2 


Oi — 02 

_|_ ^-(a2-ai)£(n)°‘p°‘/'^ 


02 — Oi 


< exp 


(a2—ai)£(Ti)°’p°/^ 


02 — Oi 


hence 


P(^c' G Bs,t{fJ‘,S)) 

< exp (-^(Cn + - h(6)) + 

\ Z 02 — Ol 

So, for all 5 small enough and all oi G (0,o), 


^ _„pg-(a2-ai)£(n)“p“/2 


limsup logP(//c' G < -y (rriaM - H^)) 


nT+Z niW2 


and finally 

limsuplimsup^^^logP(/ic' £ Bs^tid-Z)) < 1^/2 

( 5^0 n ^+00 n ' z c ' 

In the case of a $\mu$ satisfying $\mu(\{0\}) < \frac{|1-c|}{1+c}$, we have a better result.

Indeed, inspired by [15], we can observe that for all $\varepsilon$ small enough, there
exists $R > 0$ such that


1 + c 


Therefore, 

/i' G V{R) I fi'{[-R,R]) < 

is a neighbourhood of $\mu$ which, for n large enough, almost surely does not
contain $\mu_{C'}$. So we have

$$\lim_{\delta \to 0} \lim_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\big(\mu_{C'} \in B_{s,t}(\mu,\delta)\big) = -\infty. \tag{59}$$

We have obtained the upper bound of the LDP.


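Lower bound. Let $\mu \in \mathcal{P}(\mathbb{R})$ be a symmetric measure such that $\mu(\{0\}) \ge \frac{|1-c|}{1+c}$.
There exists $\tilde{\mu} \in \mathcal{P}(\mathbb{R}_+)$ such that

$$\mu = \frac{|1-c|}{1+c}\, \delta_0 + \frac{1 \wedge c}{1+c}\, \big(\tilde{\mu} + (-\mathrm{Id})_\# \tilde{\mu}\big),$$

where $(-\mathrm{Id})_\# \tilde{\mu}$ denotes the push-forward of $\tilde{\mu}$ by $-\mathrm{Id}$.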

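We denote by $x_1, \ldots, x_{n \wedge p}$ the quantiles of $\tilde{\mu}$ of orders $\frac{1}{1 + n \wedge p}, \ldots, \frac{n \wedge p}{1 + n \wedge p}$, we
also define $n_0 = \min\{j \in [\![1, n \wedge p]\!] \mid x_j \ge \varepsilon(n)\}$, and

$$M' = \begin{pmatrix} 0 & M \\ M^* & 0 \end{pmatrix}$$

with $M \in \mathcal{M}_{n,p}(\mathbb{R})$ defined by $M_{j,j} = x_j$ for all $j \in [\![n_0, n \wedge p]\!]$ and $M_{j,k} = 0$
otherwise.

From (55), we have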


L'M' 


nAp I I 

—^ + “TT-^0 • 

i=i 


Besides, 


m„(/r) 


2(1 Ac) 

1 + C Jo 


x|" d/2(x) > 


2(1 Ac) 


nAp 


(1 -b c)(l + nAp) ^ 




(60) 


Let us also remark that by construction, $d_{s,t}(\mu, \mu_{M'})$ tends to 0 as $n \to +\infty$.

Let $\delta > 0$. For n large enough, we thus have

$$d_{s,t}(\mu, \mu_{M'}) \le \frac{\delta}{2}. \tag{61}$$

Using (61), the Hoffman-Wielandt inequality (93), the independence of the
$X_{j,k}$'s, the inequalities (56) and (57), the fact that $1 \le n_0 \le n \wedge p$, and (60),
we get for n large enough,

$\mathbb{P}\big(\mu_{C'} \in B_{s,t}(\mu,\delta)\big)$


> P /iC' G 


1 

> PI -Tr((C'- M')^) <— 

^n + p ’ ’ ~ ^ 

j,k 

> P ^Vj G [no, nApl {Cjj - Mjjf < 


n \/{j,k) different, Cj^k = 0 


, p (v. . ^ . {m,„ - 

n y{j,k) different, \Xj^k\ < £{n)^/p 


1—r n \M / (n -n W/'r, Wr>“/2 \ {flAp nQ + l) 

> I I e~ ^ (1 — e”' ^ ^ I 

j=no 

/ nAp \ 

> -exp 

\ j=no ) 


> _exp I 


(1 + c) (l+cAl) 


1 A c 


nr„(/i) 


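Note that we can apply (56) and (57), even if it means swapping $\tilde{\mu}$ and
$(-\mathrm{Id})_\# \tilde{\mu}$ in order to apply (57). We finally get

$$\liminf_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\big(\mu_{C'} \in B_{s,t}(\mu,\delta)\big) \ge -\Phi'(\mu) \tag{62}$$

for all $\delta > 0$.

This is the lower bound of the LDP.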

Exponential tightness. Let A > 0 and m = We recall that 

the set $K_{\alpha,m}$ defined by (53) is compact. Moreover, using the computations
done for the upper bound, we have


F{i2c' i Ka,m) = f'imainc') > m) 

< exp (- — {Cn + 

V 2 02-01 / 

for all $a_1 \in (0,a)$ and some $a_2 \in (a_1,a)$. It follows that

i K^,^) < . (63) 

The combination of (58), (59), (62), and (63) is the desired LDP. □


3.3 Conclusion 


To conclude this section, we show how to deduce the LDP for $\mu_{XX^*/p}$
(Theorem 1.7) from the LDP for $\mu_{C'}$ (Proposition 3.5).

Proposition 3.8. The measure $\mu_{CC^*}$ satisfies the LDP with speed $n^{1+\alpha/2}$
in $\mathcal{P}(\mathbb{R}_+)$, for the weak topology and the good rate function $\Psi'$ defined by




i/ zy({0}) > max ( 0,1 - i) 
-|-oo otherwise 


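Proof. We define

$$T_n : \mu \mapsto \frac{1}{2}\Big(1 + \frac{1}{c_n}\Big)\mu^2 + \frac{1}{2}\Big(1 - \frac{1}{c_n}\Big)\delta_0 \quad \text{and} \quad T : \mu \mapsto \frac{1}{2}\Big(1 + \frac{1}{c}\Big)\mu^2 + \frac{1}{2}\Big(1 - \frac{1}{c}\Big)\delta_0,$$

where $\mu^2$ denotes the push-forward of $\mu$ by $x \mapsto x^2$, so that

$$\mu_{CC^*} = T_n(\mu_{C'}), \tag{64}$$

see (54).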


Besides, we have

$$\lim_{n\to+\infty} d_{s,t}\big(\mu_{CC^*}, T(\mu_{C'})\big) = 0. \tag{65}$$

Indeed, let $n \in \mathbb{N}$ and $z$ in the domain $\mathcal{D}_{s,t}$. We have


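$$\big|G_{\mu_{CC^*}}(z) - G_{T(\mu_{C'})}(z)\big| = \Big|\int \frac{d(T_n(\mu_{C'}))(x)}{z - x} - \int \frac{d(T(\mu_{C'}))(x)}{z - x}\Big| \le \frac{1}{2}\Big|\frac{1}{c_n} - \frac{1}{c}\Big| \, \Big|\int \frac{d\mu_{C'}(x)}{z - x^2}\Big| + \frac{1}{2}\Big|\frac{1}{c_n} - \frac{1}{c}\Big| \frac{1}{|z|} \le \frac{1}{|\operatorname{Im} z|}\Big|\frac{1}{c_n} - \frac{1}{c}\Big|$$

so, taking the supremum over $z \in \mathcal{D}_{s,t}$ and the limit as $n \to +\infty$, we get (65).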


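The contraction principle applied to the function T, see [9, Theorem
4.2.1], will allow us to conclude. Indeed, T takes its values in $\mathcal{P}(\mathbb{R}_+)$ and is
continuous for the weak topology. This strategy yields the good
rate function $\Psi'$ defined for all $\nu \in \mathcal{P}(\mathbb{R}_+)$ by

$$\Psi'(\nu) = \inf\{\Phi'(\mu),\ \mu \in \mathcal{P}(\mathbb{R}) \text{ s.t. } \nu = T(\mu)\}.$$

For all $\mu \in \mathcal{P}(\mathbb{R})$, we have $(T(\mu))(\{0\}) = \frac{1}{2}\big(1 + \frac{1}{c}\big)\mu(\{0\}) + \frac{1}{2}\big(1 - \frac{1}{c}\big)$, so

$$\mu(\{0\}) \ge \frac{|1-c|}{1+c} \iff (T(\mu))(\{0\}) \ge \frac{1}{2}\cdot\frac{|1-c| + (c-1)}{c} = \max\Big(0, 1 - \frac{1}{c}\Big).$$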


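Therefore, for all $\nu \in \mathcal{P}(\mathbb{R}_+)$ such that $\nu(\{0\}) \ge \max\big(0, 1 - \frac{1}{c}\big)$, there exists
a symmetric $\mu \in \mathcal{P}(\mathbb{R})$ satisfying $\mu(\{0\}) \ge \frac{|1-c|}{1+c}$ and $\nu = T(\mu)$. We have in
this case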




a 

C “/2 




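which gives the value of $\Psi'(\nu)$. In the case of $\nu(\{0\}) < \max\big(0, 1 - \frac{1}{c}\big)$, we
cannot find a symmetric $\mu \in \mathcal{P}(\mathbb{R})$ satisfying $\mu(\{0\}) \ge \frac{|1-c|}{1+c}$ and $\nu = T(\mu)$, so
$\Psi'(\nu) = +\infty$. Thus, we have computed $\Psi'(\nu)$ for every $\nu \in \mathcal{P}(\mathbb{R}_+)$.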
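The effect of the map T on the mass at zero can be illustrated on finitely supported measures. The sketch below assumes that T acts as $T(\mu) = \frac{1}{2}(1 + \frac{1}{c})\mu^2 + \frac{1}{2}(1 - \frac{1}{c})\delta_0$, with $\mu^2$ the push-forward of $\mu$ by $x \mapsto x^2$, which is the form suggested by identity (54); the measures are represented as dictionaries from atoms to masses, and the two examples are hypothetical.

```python
from fractions import Fraction

def T(mu, c):
    """Image of a finitely supported measure mu (dict atom -> mass) under T.

    Assumed form: T(mu) = (1/2)(1 + 1/c) * (push-forward of mu by x -> x^2)
                          + (1/2)(1 - 1/c) * delta_0.
    """
    a = Fraction(1, 2) * (1 + 1 / c)
    b = Fraction(1, 2) * (1 - 1 / c)
    nu = {}
    for x, w in mu.items():
        nu[x * x] = nu.get(x * x, 0) + a * w
    nu[0] = nu.get(0, 0) + b
    return {x: w for x, w in nu.items() if w != 0}

def threshold(c):
    """Minimal mass at zero |1 - c| / (1 + c) required of mu."""
    return abs(1 - c) / (1 + c)

# A symmetric measure with exactly the minimal mass at zero, for two values of c.
results = {}
for c in (Fraction(2), Fraction(1, 2)):
    t = threshold(c)
    mu = {0: t, 3: (1 - t) / 2, -3: (1 - t) / 2}
    results[c] = T(mu, c)
    print(c, results[c])
```

With these inputs, the image has mass $\max(0, 1 - 1/c)$ at zero and total mass 1, in line with the equivalence displayed above.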


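Lower bound. Let $\mu \in \mathcal{P}(\mathbb{R}_+)$ and $\delta > 0$. From (65), for n large enough,
we have $d_{s,t}\big(\mu_{CC^*}, T(\mu_{C'})\big) \le \frac{\delta}{2}$, hence

$$\mathbb{P}\big(\mu_{CC^*} \in B_{s,t}(\mu,\delta)\big) \ge \mathbb{P}\Big(T(\mu_{C'}) \in B_{s,t}\big(\mu, \tfrac{\delta}{2}\big)\Big).$$

By Proposition 3.5 and the contraction principle, we thus have

$$\liminf_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}\big(\mu_{CC^*} \in B_{s,t}(\mu,\delta)\big) \ge -\inf_{\nu \in B_{s,t}(\mu,\delta/2)} \Psi'(\nu) \ge -\Psi'(\mu). \tag{66}$$

Upper bound. Let F be a closed subset of $\mathcal{P}(\mathbb{R}_+)$ and $\delta > 0$. From (65),
for n large enough, we have $d_{s,t}\big(\mu_{CC^*}, T(\mu_{C'})\big) \le \delta$, so

$$\mathbb{P}(\mu_{CC^*} \in F) \le \mathbb{P}\big(T(\mu_{C'}) \in F^\delta\big),$$

where $F^\delta$ denotes the $\delta$-neighbourhood of F for the distance $d_{s,t}$, namely
$F^\delta = \{\nu \in \mathcal{P}(\mathbb{R}_+) \mid \exists \mu \in F,\ d_{s,t}(\mu,\nu) \le \delta\}$.
Applying the contraction principle again, we thus have

$$\limsup_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}(\mu_{CC^*} \in F) \le -\inf_{\nu \in F^\delta} \Psi'(\nu).$$

This is true for all $\delta > 0$ so, taking the limit as $\delta \to 0$, we get (see [9, Lemma
4.1.6(a)])

$$\limsup_{n\to+\infty} \frac{1}{n^{1+\alpha/2}} \log \mathbb{P}(\mu_{CC^*} \in F) \le -\inf_{\nu \in F} \Psi'(\nu). \tag{67}$$

Combining (66) and (67), we conclude that $\mu_{CC^*}$ satisfies the
announced LDP. □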

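Because rectangular free convolution is continuous for the weak topology (see
Theorem 3.12 in the reference on rectangular free convolution), the function
$\mu \mapsto (\sqrt{\mu} \boxplus_c \sqrt{\mu_{MP,c}})^2$ is continuous as well. Therefore, by
Proposition 3.8 and the contraction principle, $(\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2$ satisfies
the LDP with speed $n^{1+\alpha/2}$ on $\mathcal{P}(\mathbb{R}_+)$ governed by the good rate function

$$\mu \mapsto \begin{cases} \Psi'(\nu) & \text{if there exists } \nu \in \mathcal{P}(\mathbb{R}_+) \text{ such that } \mu = (\sqrt{\nu} \boxplus_c \sqrt{\mu_{MP,c}})^2, \\ +\infty & \text{otherwise.} \end{cases}$$

Thanks to the exponential equivalence between $\mu_{XX^*/p}$ and $(\sqrt{\mu_{CC^*}} \boxplus_c \sqrt{\mu_{MP,c}})^2$
obtained in Proposition 3.1, we can conclude that $\mu_{XX^*/p}$ satisfies the same
LDP, see [9, Theorem 4.2.13], which ends the proof of Theorem 1.7.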

A Concentration bounds for the information-plus-noise model

A.1 Concentration for some functions of the resolvent

In Section 2, in order to prove Lemma 2.5, we needed the following
concentration estimates.

Proposition A.1 (adaptation from [17, Lemma 8]). Let $Y \in \mathcal{M}_{n,p}(\mathbb{R})$ be
a random matrix with i.i.d. entries, let $M \in \mathcal{M}_{n,p}(\mathbb{R})$ be a deterministic
matrix and let $z \in \mathbb{C} \setminus \mathbb{R}$.

We define $X = \frac{Y}{\sqrt{p}} + M$, $S = (zI_n - XX^*)^{-1}$ the resolvent of $XX^*$, $c_n = \frac{n}{p}$,
and $\sigma^2 = \operatorname{Var}(Y_{1,1})$.

We assume that the distribution of $Y_{1,1}$ has mean zero and satisfies the
following Poincaré inequality:

$$\forall f \in C^1(\mathbb{R},\mathbb{R}) \text{ s.t. } \mathbb{E}\big(f'(Y_{1,1})^2\big) < +\infty, \quad \operatorname{Var}(f(Y_{1,1})) \le \sigma^2\, \mathbb{E}\big(f'(Y_{1,1})^2\big).$$

Then, for all deterministic matrices $U \in \mathcal{M}_n(\mathbb{R})$ and $V \in \mathcal{M}_{n,p}(\mathbb{R})$, and for
all integers $n, p$, we have

Var (^Tr(SD)) < ^^uiz)\\U\\{Tv{UU^))^/^ (68) 













and 


Var 


- Tt{X^SV) ] < 


9(7^ Cn 


v{z) 


X max 


(Tr(yyO)V2^ ||y||3/2 ||y||5/4('^(^^'))'^®' 


n 


1/2 


n 


n' 


3/8 


(69) 


where 

and 

1 |z| ^ 1 ^ 1^1 _|_ ^ 

limzp’limzp I Imzp ’ \| Imzp \lmz 

Remarks. • In the proof of Lemma 2.5, we apply (68) to $U = I_n$ and
$U = R$, and we apply (69) to $V = M$.

• Since $\|V\| \le \operatorname{Tr}(VV^*)^{1/2}$, (69) implies

Var Tr(X*5V)^ < ^^^v{z) Tr(VV‘). (70) 

Having in mind the large deviations in Section 3, we want to get a
bound in $\operatorname{Tr}(MM^*)$ in Lemma 2.5; that is why we use (70) instead of
(69) in its proof.

• We get here slightly better bounds than [17]. Indeed, we can recover
their results from ours since $\operatorname{Tr}(AA^*) \le n\|A\|^2$ (see Proposition B.1
(iv)). This improvement is due to the fact that we used the inequality
$|\operatorname{Tr}(AB)| \le \sqrt{n}\,\|A\| (\operatorname{Tr}(BB^*))^{1/2}$ (see Proposition B.1 (iii)) instead of
$|\operatorname{Tr}(AB)| \le n\|A\| \cdot \|B\|$.

• If the distribution of $Y_{1,1}$ satisfies the Poincaré inequality with a
constant $C$ instead of $\sigma^2$, then $\sigma^2$ must be replaced by $C$ in the bounds
(68) and (69).

• In the case of complex matrices $Y, M, U, V$, the bounds are very similar
and only the constants change.

Proof. Using the sub-additivity property of variance, the Poincaré inequality,
and the differentiation formula (84), we get


Var('iTV(S[/))=Var( 

^ ' Vl<i,fc<n / 

^ ^ 1 ~ ^ ^ '^Da bSj k-Uk j 

( ^^{S^)j,bSa,kUk,j + Sj^a{X S)b^kUk, 


< 0-^E 


= a^E 


2n 


a,b 


n^p 


hk 


a 

ri^p 


a 

n^p 


a 

rfip 


E 


E 


Y,{{SUSX)a,b + {X^SUS\a 


a,b 


^iSUSX)l, + 2{SUSX)a,biX^SUS)b,a + {X^SUS)l^ 

a,b 


E [Tr{SUSX{SUSXY) + 2Ti{SUSXX^SUS) 


+ Tr{X^SUS{X^SUSY)] ■ 


(71) 


Using the resolvent identity $SXX^* = XX^*S = zS - I_n$, the inequality
$|\operatorname{Tr}(AB)| \le \sqrt{n}\, \operatorname{Tr}(AA^*)^{1/2} \|B\|$ (see Proposition B.1 (iii)), and $\|S\| \le \frac{1}{|\operatorname{Im} z|}$
(see Proposition B.2 (iv)), we get


|Tr(5U,SX(S[/5X)‘)| = \Tt{U{zS - In)SU^S^)\ 

< ^||t,||Tr(C/C/r^(J^ + ^)72) 

and very similarly, 

|IV(S(7SXX‘SC/S)| < M\U\\ MUU‘)'I^ + Jj^) (73) 

and 

|Tr(X*5C/5(X*5U,S)*)| < v^||C/|| Tr([/[/*)i/2 ( \A ^ —i—^ . (74) 

' \|lm^p \lm.z\'^ J 

Combining (71), (72), (73), and (74), we conclude that
Var (1 Tr(SU)) < ^ + -^) \\U\\ Tr([/C/*)^2 . 
Let us now prove the second inequality. By the same arguments as above,


we get 




a,b 



+ Tv{X^SVX^S{X^SVX^SY) + 2Tr{SVX^SX{SVY) 

+ 2 Tv{SVX^SVX*S) + 2 Tv{SVX^SXX^SVX^S)] . (75) 


We will now bound these terms, always using the resolvent identities
$SXX^* = XX^*S = zS - I_n$, $XX^*S^* = S^*XX^* = \bar{z}S^* - I_n$, inequalities
(i)-(iii) in Proposition B.1, and the bound $\|S\| \le \frac{1}{|\operatorname{Im} z|}$. We get for example


|Tr(,Sl/(5P)*)| = |Tr(5PP*5)| < v^||P|| (76) 


and 


\Tv{SVX^SX{SVX^SXf)\ 

= \Ti{V*S^VX\zS - In)SX)\ 

< TT{V*S^VV\S*fV)^/^ Tt{X\zS - In)SXX*S*{zS* - In)X)^^^ 

= Tt{V^S^VV\S*)^V)^/^ Tt{S*{zS* - In)XX\zS - In){zS - 4 ))^/^ 

< (v^llPf ||5f Tr(Pf^*)i/2y'''Tr((z5* -/„)2 (z5-4)2)V2 



Very similarly, we get 


|TV(V*5VV*5(X*5VV*5)*)| 
|TV(5FX‘5Xy‘5)| <n^/®||Ff/^Tr(yy*)3/8 f—^ + —i—^ , (79)

|■I^(Sr.Y‘SFA"S)| < Villi'll MVV‘YI^ (p^ + p^) . ( 80 ) 

and finally 

\Ti(SVX^SXX^SVX^S)\ < v^||y|| 

' ' yllmzl^ jlmz 

(81) 

Combining inequalities from (75) to (81), we finally obtain the announced
bound. □

Remark. In the proof above, it is possible to improve some majorizations
using the inequality $\|SX\| \le \frac{1}{|\mathrm{Im}\,y|}$, where $y$ is a square root of $z$ (see
Proposition B.2 (v)). For instance, it yields

$$|\mathrm{Tr}(X^tSVX^tS(X^tSVX^tS)^t)| \le \sqrt{n}\,\mathrm{Tr}(VV^*)^{1/2}\,\|V\|\,\frac{1}{|\mathrm{Im}\,y|^4}$$

instead of (78). However, in (77), which is the other dominant term in (75),
we cannot improve the power of $n$ by this strategy.



A.2 Concentration of the empirical spectral measure

In Section 3, in order to prove Proposition 3.1, we needed the following
concentration bound.


Proposition A.2 (adaptation from [TJ Theorem 2.5]). Let k > 1, T 


M 


n,p\ 


a random matrix with i.i.d. entries bounded by k, M £ Xi 


n,p 


deterministic matrix such that ^Tr(MM*) < and s,t > 0. IFe assume 
that Cn = ^ ^ c G (0, -(-oo) as n ^ +oo. 

There exists /3 > 0 such that for all s large enough, t small enough, n large 

[■ /„ 2\2/5 1 

enough, and 5 G ( ) ,1 , we have 


P 


ds,t (^h(Y/^+M){Y/^+M)£^ T{Y/^+M){Y/^+My'^ > ^ 



Remarks. • Here $\kappa$ is a constant, but we are interested in the dependence
on $\kappa$ in the bound since we apply (82) to a $\kappa$ depending on $n$ in
Section 3.

• This result remains true if $Y$ and $M$ are complex matrices and the
entries of $Y$ have independent real and imaginary parts.
Proof. We will apply [13, Theorem 1.3(b)] to the $(n + p) \times (n + p)$ matrix


X. = 


0 

Y 

+ M 

(*+")■ 

0 


The matrix $M$ is not present in [13], but it can be added because

2 

< 

np 




3,k 


j,k 


SO, thanks to the hypotheses on Y and M, we have ^^Tr(X^) < 8k^. 
Therefore, the argument in [13, p. 132] does not change and we can apply
[13, Theorem 1.3(b)] with the matrix $M$ added. Consequently, there exists $\beta > 0$


such that for all n large enough and d G 




, we have 


sup 

. / 


/ rfffXA - E / / d/ix^ 


> A 1 < 

- - 5^/^ 


where the supremum is taken over all bounded Lipschitz functions $f$ such
that

$$\sup_{x} |f(x)| + \sup_{x \neq y} \frac{|f(x) - f(y)|}{|x - y|} \le 1. \tag{83}$$


Moreover, using (90), we can check that when $s$ is large enough and $t$
is small enough, for every $z \in \mathcal{D}_{s,t}$, the function $f : x \mapsto \frac{1}{z - x^2}$ is Lipschitz,
bounded, and satisfies (83). Noting in addition that


z — x^ 






+ (n -p)-~ 


and using the definition (5) of $d_{s,t}$, we find (82), up to changing $\beta$. □


B Technical tools 

In this appendix, we summarize miscellaneous results used throughout the 
paper. 
B.1 Trace and matrix norm inequalities

For a matrix $A \in \mathcal{M}_{n,p}(\mathbb{C})$, we denote by $\|A\|$ its operator norm associated
to Euclidean norms and

$$\|A\|_\infty = \max_{j,k} |A_{j,k}|.$$

If $B$ is another matrix in $\mathcal{M}_{n,p}(\mathbb{C})$, we denote by $A \circ B$ the Hadamard
product of $A$ and $B$, i.e. the matrix defined by $(A \circ B)_{j,k} = A_{j,k}B_{j,k}$. Finally,
$\mathrm{diag}(A)$ denotes the matrix whose entries are given by $A_{j,k}\delta_{j,k}$, where $\delta$ is
the Kronecker symbol.

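Proposition B.1. Let $A, B \in \mathcal{M}_{n,p}(\mathbb{C})$, $C \in \mathcal{M}_{p,n}(\mathbb{C})$, $D \in \mathcal{M}_n(\mathbb{C})$, $E \in \mathcal{M}_{p,q}(\mathbb{C})$. We have the following.

(i) $|\mathrm{Tr}(AC)| \le (\mathrm{Tr}(AA^*))^{1/2}(\mathrm{Tr}(CC^*))^{1/2}$,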

(^^) I Tr(AC)| <nP||.lie'll, 

(ivi) |Tr(AC)| < ^\\A\\{Ti{CC*)Y/‘^, 

(tv) Pll < (Tr(^A*))V2 < 

(v) $(\mathrm{Tr}(AA^*))^{1/2} \le \sqrt{np}\,\|A\|_\infty$,

(vi) $\|\mathrm{diag}(D)\| = \|\mathrm{diag}(D)\|_\infty \le \|D\|_\infty$,

(vii) $\|A \circ B\|_\infty \le \|A\|_\infty \|B\|_\infty$,

(viii) $\|A \circ B\| \le \|A\|\,\|B\|$,

(ix) Tt{{AoB){AoB)*) < Tr(yl^*)||B||^. 
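As a quick numerical sanity check, the sketch below tests three of the inequalities above on random complex matrices: the Cauchy-Schwarz bound (i), the entrywise bound $(\mathrm{Tr}(AA^*))^{1/2} \le \sqrt{np}\,\|A\|_\infty$ from (v), and the Hadamard bound (vii). The sizes, the seed and all helper names are illustrative assumptions, not part of the paper, and only the Python standard library is used.

```python
import random

def conj_transpose(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[m[j][k].conjugate() for j in range(len(m))] for k in range(len(m[0]))]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[j][l] * b[l][k] for l in range(len(b))) for k in range(len(b[0]))]
            for j in range(len(a))]

def trace(m):
    return sum(m[j][j] for j in range(len(m)))

def hs_norm(m):
    """Hilbert-Schmidt norm (Tr(M M^*))^{1/2}."""
    return trace(matmul(m, conj_transpose(m))).real ** 0.5

def max_norm(m):
    """Entrywise maximum norm ||M||_infty."""
    return max(abs(x) for row in m for x in row)

def random_matrix(rows, cols, rng):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]

rng = random.Random(0)   # arbitrary fixed seed (assumption)
n, p = 3, 5              # illustrative sizes (assumption)
A = random_matrix(n, p, rng)
B = random_matrix(n, p, rng)
C = random_matrix(p, n, rng)
hadamard = [[A[j][k] * B[j][k] for k in range(p)] for j in range(n)]

tol = 1e-12  # slack for floating-point rounding
check_i = abs(trace(matmul(A, C))) <= hs_norm(A) * hs_norm(C) + tol
check_v = hs_norm(A) <= (n * p) ** 0.5 * max_norm(A) + tol
check_vii = max_norm(hadamard) <= max_norm(A) * max_norm(B) + tol
print(check_i, check_v, check_vii)
```

The operator-norm items are left out on purpose, since computing $\|A\|$ would require a singular value routine that the standard library does not provide.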

Most of these points are classical or easy to check. Note that the
combination (iii) of (i) and (ii) will be crucial for us and that a proof of (viii)
requires the use of the matrices

^ ^1,1 ^1,2 \ 

A = 

\ ^n,2 .^n,p ) 
and



B.2 Properties of resolvents

Let $A \in \mathcal{H}_n(\mathbb{C})$ and $z \in \mathbb{C} \setminus \mathbb{R}$. The resolvent of $A$ at $z$ is the matrix
$R(A) = (zI_n - A)^{-1}$. For $A \in \mathcal{M}_{n,p}(\mathbb{C})$, we denote by $S(A)$ the resolvent
$R(AA^*)$, or just $S$ if no confusion can arise.

Proposition B.2. Let $A, B \in \mathcal{M}_{n,p}(\mathbb{C})$ and $z \in \mathbb{C} \setminus \mathbb{R}$. We have the
following.

(i) $SAA^* = AA^*S = zS - I_n$,

(ii) $S(A + B) - S(A) = S(A + B)(AB^* + BA^* + BB^*)S(A)$,

(in) = ^Tr5, 

(iv) $\|S\|_\infty \le \|S\| \le \frac{1}{|\mathrm{Im}\,z|}$,

(v) $\|SA\|_\infty \le \|SA\| \le \frac{1}{|\mathrm{Im}\,y|}$, where $y$ is a square root of $z$,

(vt) + 

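(vii) We denote by $D_{a,b}$ the derivative w.r.t. $\mathrm{Re}\,A_{a,b}$ and by $\delta$ the Kronecker
symbol. For all $a, j, k \in [\![1, n]\!]$ and $b, l, m \in [\![1, p]\!]$, we have

$$D_{a,b}S_{j,k} = (SA)_{j,b}S_{a,k} + S_{j,a}(A^*S)_{b,k}, \tag{84}$$

$$D_{a,b}(SA)_{j,l} = (SA)_{j,b}(SA)_{a,l} + S_{j,a}(A^*SA)_{b,l} + \delta_{b,l}S_{j,a}, \tag{85}$$

$$D_{a,b}(A^*S)_{l,k} = (A^*SA)_{l,b}S_{a,k} + (A^*S)_{l,a}(A^*S)_{b,k} + \delta_{b,l}S_{a,k}, \tag{86}$$

$$D_{a,b}(A^*SA)_{l,m} = (A^*SA)_{l,b}(SA)_{a,m} + (A^*S)_{l,a}(A^*SA)_{b,m} + \delta_{b,m}(A^*S)_{l,a} + \delta_{b,l}(SA)_{a,m}, \tag{87}$$

$$D^2_{a,b}S_{j,k} = 2\big[S_{j,a}S_{a,k} + (SA)_{j,b}(SA)_{a,b}S_{a,k} + S_{j,a}(A^*SA)_{b,b}S_{a,k} + S_{j,a}(A^*S)_{b,a}(A^*S)_{b,k} + (SA)_{j,b}S_{a,a}(A^*S)_{b,k}\big], \tag{88}$$

$$\begin{aligned}
D^3_{a,b}S_{j,k} = 6\big[&(SA)_{j,b}S_{a,a}S_{a,k} + S_{j,a}(A^*S)_{b,a}S_{a,k} + S_{j,a}(SA)_{a,b}S_{a,k} \\
&+ S_{j,a}S_{a,a}(A^*S)_{b,k} + (SA)_{j,b}(SA)^2_{a,b}S_{a,k} + S_{j,a}(A^*S)^2_{b,a}(A^*S)_{b,k} \\
&+ (SA)_{j,b}(SA)_{a,b}S_{a,a}(A^*S)_{b,k} + (SA)_{j,b}S_{a,a}(A^*S)_{b,a}(A^*S)_{b,k} \\
&+ S_{j,a}(A^*SA)_{b,b}(SA)_{a,b}S_{a,k} + (SA)_{j,b}S_{a,a}(A^*SA)_{b,b}S_{a,k} \\
&+ S_{j,a}(A^*S)_{b,a}(A^*SA)_{b,b}S_{a,k} + S_{j,a}(A^*SA)_{b,b}S_{a,a}(A^*S)_{b,k}\big].
\end{aligned} \tag{89}$$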
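The first-order formula (84), read as $D_{a,b}S_{j,k} = (SA)_{j,b}S_{a,k} + S_{j,a}(A^*S)_{b,k}$, can be checked against a central finite difference. The sketch below is only an illustration under simplifying assumptions: $A$ is a real $2 \times 3$ matrix (so $A^* = A^t$ and the $2 \times 2$ inverse is explicit), the values of $A$, $z$ and the step $h$ are arbitrary, and the function names are hypothetical.

```python
def resolvent(A, z):
    """S(A) = (z I_2 - A A^t)^{-1} for a real 2 x p matrix A (explicit 2 x 2 inverse)."""
    p = len(A[0])
    g = [[sum(A[j][l] * A[k][l] for l in range(p)) for k in range(2)] for j in range(2)]
    m00, m01, m10, m11 = z - g[0][0], -g[0][1], -g[1][0], z - g[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]

def formula_84(A, z, a, b, j, k):
    """Right-hand side of (84): (SA)_{j,b} S_{a,k} + S_{j,a} (A^t S)_{b,k}."""
    S = resolvent(A, z)
    SA_jb = sum(S[j][l] * A[l][b] for l in range(2))
    AtS_bk = sum(A[l][b] * S[l][k] for l in range(2))
    return SA_jb * S[a][k] + S[j][a] * AtS_bk

def finite_difference(A, z, a, b, j, k, h=1e-6):
    """Central difference of S_{j,k} with respect to the entry A_{a,b}."""
    plus = [row[:] for row in A]
    minus = [row[:] for row in A]
    plus[a][b] += h
    minus[a][b] -= h
    return (resolvent(plus, z)[j][k] - resolvent(minus, z)[j][k]) / (2 * h)

# Illustrative data (assumptions): indices are 0-based in the code.
A = [[0.3, -1.2, 0.5], [0.8, 0.1, -0.7]]
z = 0.4 + 1.0j
max_error = max(
    abs(formula_84(A, z, a, b, j, k) - finite_difference(A, z, a, b, j, k))
    for a in range(2) for b in range(3) for j in range(2) for k in range(2)
)
print(max_error)
```

The same finite-difference pattern could be applied to (85)-(87); the higher-order formulas (88) and (89) would need a second- or third-order difference scheme and are not covered here.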


Most of these relations are classical or obtained by simple computations. 
Note however that (v) and (vi) respectively follow from the identities 


yhn 


( yS* 

AitA - AM)-1 

-A* 

yip ) 

( (5A)* 

At Ip - a*a)-^ 


and $A^*(zI_n - AA^*)^{-1}A = -I_p + z(zI_p - A^*A)^{-1}$.
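The resolvent identity $SAA^* = zS - I_n$ and the identity $A^*(zI_n - AA^*)^{-1}A = -I_p + z(zI_p - A^*A)^{-1}$ are easy to confirm numerically. The following sketch does so for one random complex $2 \times 3$ matrix, and also checks the entrywise consequence $\max_{j,k}|S_{j,k}| \le 1/|\mathrm{Im}\,z|$ of the standard resolvent bound. The Gauss-Jordan inverse, the seed and the chosen $z$ are assumptions made for illustration only.

```python
import random

def conj_transpose(m):
    return [[m[j][k].conjugate() for j in range(len(m))] for k in range(len(m[0]))]

def matmul(a, b):
    return [[sum(a[j][l] * b[l][k] for l in range(len(b))) for k in range(len(b[0]))]
            for j in range(len(a))]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny matrices)."""
    n = len(m)
    aug = [list(row) + [1.0 if j == k else 0.0 for k in range(n)]
           for j, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(h, z):
    """(z I - H)^{-1} for a square matrix H."""
    n = len(h)
    return inverse([[(z if j == k else 0.0) - h[j][k] for k in range(n)]
                    for j in range(n)])

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

rng = random.Random(1)   # arbitrary fixed seed (assumption)
n, p = 2, 3              # illustrative sizes (assumption)
z = 0.5 + 0.8j           # any non-real z works
A = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(p)] for _ in range(n)]
A_star = conj_transpose(A)
S = resolvent(matmul(A, A_star), z)

# Resolvent identity: S A A^* = z S - I_n
zS_minus_I = [[z * S[j][k] - (1.0 if j == k else 0.0) for k in range(n)]
              for j in range(n)]
err_resolvent = max_diff(matmul(S, matmul(A, A_star)), zS_minus_I)

# A^* S A = -I_p + z (z I_p - A^* A)^{-1}
R_tilde = resolvent(matmul(A_star, A), z)
rhs = [[z * R_tilde[j][k] - (1.0 if j == k else 0.0) for k in range(p)]
       for j in range(p)]
err_sandwich = max_diff(matmul(matmul(A_star, S), A), rhs)

# Entrywise consequence of ||S|| <= 1/|Im z|
bound_holds = max(abs(x) for row in S for x in row) <= 1 / abs(z.imag)
print(err_resolvent, err_sandwich, bound_holds)
```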

Note also that if $y$ is a square root of $z$ and $z$ belongs to the domain $\mathcal{D}_{s,t}$
defined by (6), then we can easily prove that


Imy| 


Imz / j (Re z)^ ~ Rez\ 

2 (Imz)2 Imzy 


> (|(V^t^-t))'^'>0. (90) 


B.3 Inequalities for empirical spectral measures 

Proposition B.3 (Rank inequality, see [7, Lemma B.1]). Let $A, B \in \mathcal{H}_n(\mathbb{C})$.
We have

$$d_{KS}(\mu_A, \mu_B) \le \frac{1}{n}\,\mathrm{rank}(A - B). \tag{91}$$

Proposition B.4 (Rank inequality for covariance matrices, see [3, Theorem
A.44]). Let $A, B \in \mathcal{M}_{n,p}(\mathbb{C})$. We have

$$d_{KS}(\mu_{AA^*}, \mu_{BB^*}) \le \frac{1}{n}\,\mathrm{rank}(A - B). \tag{92}$$

Proposition B.5 (Hoffman-Wielandt inequality, see [7, Lemma B.2]). Let
$A, B \in \mathcal{H}_n(\mathbb{C})$. We have

$$W_2(\mu_A, \mu_B)^2 \le \frac{1}{n}\,\mathrm{Tr}((A - B)^2), \tag{93}$$

where $W_2$ denotes the $L^2$-Wasserstein distance on $\mathcal{P}(\mathbb{R})$.
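For intuition, the Hoffman-Wielandt inequality in the form $W_2(\mu_A, \mu_B)^2 \le \frac{1}{n}\mathrm{Tr}((A-B)^2)$ can be tested in the smallest non-trivial case. The sketch below assumes real symmetric $2 \times 2$ matrices, whose eigenvalues have a closed form, and uses the fact that for two empirical measures on $\mathbb{R}$ with the same number of atoms the $W_2$ distance is attained by pairing sorted atoms. The sampling range, the seed and the helper names are illustrative choices.

```python
import math
import random

def eigenvalues_sym2(a, b, d):
    """Sorted eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return (mean - radius, mean + radius)

def w2_squared(ev_a, ev_b):
    """Squared W_2 distance between empirical measures with sorted atoms of equal mass."""
    return sum((x - y) ** 2 for x, y in zip(ev_a, ev_b)) / len(ev_a)

def trace_diff_squared(ma, mb):
    """Tr((A - B)^2) for symmetric 2 x 2 matrices encoded as triples (a, b, d)."""
    da, db, dd = (x - y for x, y in zip(ma, mb))
    return da ** 2 + 2 * db ** 2 + dd ** 2

rng = random.Random(2)   # arbitrary fixed seed (assumption)
n = 2
violations = 0
for _ in range(1000):
    ma = tuple(rng.uniform(-3, 3) for _ in range(3))
    mb = tuple(rng.uniform(-3, 3) for _ in range(3))
    lhs = w2_squared(eigenvalues_sym2(*ma), eigenvalues_sym2(*mb))
    rhs = trace_diff_squared(ma, mb) / n
    if lhs > rhs + 1e-9:   # small slack for rounding
        violations += 1
print(violations)
```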

Proposition B.6 (see [3, Corollary A.42]). Let $A, B \in \mathcal{M}_{n,p}(\mathbb{C})$. We have

WiAAA*,hBB^) < ^Tt{AA*+BB*)T i{{A-B){A-BY). (94) 


Proposition B.7 (Schatten's inequality, see [20, Theorem 3.32]). Let $A \in \mathcal{H}_n(\mathbb{C})$ and $p \in (0, 2]$. We have


[ \x\PdpA{x) < - 

^ k=l 



(95) 